Test all visualizers against API promises systematically #180

Closed · 9 tasks
NealHumphrey opened this issue Mar 23, 2017 · 12 comments
Labels: level: expert (deep knowledge of packages required) · priority: medium (can wait until after next release) · type: technical debt (work to optimize or generalize code)

Comments

@NealHumphrey
Contributor

NealHumphrey commented Mar 23, 2017

Potential testing idea - not sure how much of this is worth the time. These could be built into each visualizer's unit tests, but it would be good to have a basic set of checks that can be run against all visualizers.

To make sure that each new visualizer fulfills the few important API promises discussed yesterday, we should write unit tests that check that every visualizer meets them.

  • Either make a list of all visualizers or (preferred) automatically find all children of the Visualizer class and run the tests on each one.

Checks (a rough sketch of a few of these follows the list):

  • fit returns self
  • transform returns X
  • check that it can accept both a numpy array and a pandas DataFrame
  • check that there is a docstring under the class name, and not one under __init__
  • use the inspect module to find all parameter names, and check that the class docstring contains each of those names somewhere (to avoid undocumented params)
  • see if there is any other quick and easy formatting check we can do on the docstring; not sure there's anything else worth adding
  • use the inspect module to search for anything that looks like a hard-coded hex color value in the draw method
  • ensure finalize calls self.set_title and not self.ax.set_title
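Something like the following could drive a couple of these checks automatically. This is only a rough sketch: all_visualizers and check_docstring_mentions_params are hypothetical helper names, and it assumes every visualizer subclasses yellowbrick.base.Visualizer.

import inspect

from yellowbrick.base import Visualizer


def all_visualizers(cls=Visualizer):
    """Recursively yield every subclass of Visualizer."""
    for sub in cls.__subclasses__():
        yield sub
        yield from all_visualizers(sub)


def check_docstring_mentions_params(viz_class):
    """Fail if the class docstring omits any __init__ parameter name."""
    doc = inspect.getdoc(viz_class) or ""
    params = inspect.signature(viz_class.__init__).parameters
    missing = [
        name for name in params
        if name not in ("self", "args", "kwargs") and name not in doc
    ]
    assert not missing, "{} docstring omits params: {}".format(
        viz_class.__name__, ", ".join(missing)
    )

Each check would then be run against every class yielded by all_visualizers inside a single test.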
@bbengfort bbengfort added level: expert deep knowledge of packages required priority: medium can wait until after next release type: technical debt work to optimize or generalize code labels Mar 23, 2017
@bbengfort
Member

bbengfort commented Mar 23, 2017

If all this is possible, that would be incredible and really contribute to my peace of mind! I think you nailed the list, and I think we could come up with a heuristic checking methodology that is built into our tests, perhaps as follows:

  1. Create a VisualizerHeuristicsTestCase.
  2. Explicitly register visualizers as a class attribute in the test case.
  3. Have a test that recurses across all subclasses of Visualizer and warns if any Visualizer isn't registered (we can also create exceptions later on).
  4. Create assertHeuristic and warnHeuristic functions; the first raises an exception if the heuristic is not met, and the second simply warns about the misbehaving visualizer.
  5. Write tests that do the checks you mentioned above, using assertHeuristic and warnHeuristic on each registered visualizer (see the sketch after this list).
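A rough sketch of what that test case might look like; the registered list, assertHeuristic, and warnHeuristic are the hypothetical pieces described in the steps above, not existing code.

import warnings
from unittest import TestCase

from yellowbrick.base import Visualizer


class VisualizerHeuristicsTestCase(TestCase):
    # visualizers explicitly registered for heuristic checking (step 2)
    registered = []

    def assertHeuristic(self, condition, message):
        """Raise an exception if the heuristic is not met (step 4)."""
        if not condition:
            raise AssertionError(message)

    def warnHeuristic(self, condition, message):
        """Simply warn about the misbehaving visualizer (step 4)."""
        if not condition:
            warnings.warn(message)

    def test_all_visualizers_registered(self):
        """Recurse across all subclasses and warn about any unregistered one (step 3)."""
        stack = list(Visualizer.__subclasses__())
        while stack:
            sub = stack.pop()
            stack.extend(sub.__subclasses__())
            self.warnHeuristic(
                sub in self.registered,
                "{} is not registered".format(sub.__name__),
            )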

@bbengfort
Member

So it turns out that scikit-learn has a check_estimator function: http://scikit-learn.org/stable/modules/generated/sklearn.utils.estimator_checks.check_estimator.html#sklearn.utils.estimator_checks.check_estimator

Which is discussed here:

http://scikit-learn.org/stable/developers/contributing.html#rolling-your-own-estimator

So potentially we can borrow/be inspired by this tool.
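For reference, its basic usage is roughly this (whether it accepts the estimator class or an instance depends on the scikit-learn version):

from sklearn.svm import LinearSVC
from sklearn.utils.estimator_checks import check_estimator

# runs sklearn's battery of API checks and raises if any promise is violated
check_estimator(LinearSVC())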

@ndanielsen
Contributor

I'm a fan of this systematic testing approach. After it is implemented, it will be great for maintaining code and documentation quality, as well as keeping testing boilerplate to a minimum.

@bbengfort
Member

@ndanielsen and @NealHumphrey -- I'm beginning to do some planning for PyCon and this seems useful for it. Does one of you have time this weekend/early next week to take a crack at it? If not, no worries, I'll go ahead and give it a try.

@ndanielsen
Contributor

ndanielsen commented May 5, 2017

I'm thinking something much more basic for this, using TestCase inheritance. We can use/modify VisualTestCase and systematically test every visualizer whose test case inherits from it:

What I'm thinking:

from unittest import TestCase

import numpy as np


class VisualTestCase(TestCase):
    viz_class = None  # subclasses plug in the Visualizer under test

    def test_fit_returns_self(self):
        if self.viz_class is None:
            self.skipTest("no visualizer registered on this test case")

        # init the Visualizer and fit on small random data
        viz = self.viz_class()
        X, y = np.random.rand(100, 4), np.random.rand(100)

        # fit must return the same instance
        self.assertIs(viz.fit(X, y), viz)

For existing tests, we just plug in viz_class with the Visualizer being tested. This is a basic, crude example, but I think it would save us from having to develop a whole testing paradigm for this.
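For example, a concrete test case would look something like this (Rank2D is used purely as an illustration):

from yellowbrick.features import Rank2D


class Rank2DTests(VisualTestCase):
    # the inherited test_fit_returns_self now runs against Rank2D
    viz_class = Rank2D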

@bbengfort @NealHumphrey

@ndanielsen
Contributor

I might have a good starting implementation of this by early / mid next week

@bbengfort
Member

@ndanielsen I don't find that crude at all and I think it's a good idea so that we have the checks tested automatically. Once you get your preliminary insertions together, I can formulate the check_visualizer utility in the main package, since that's how we'll support deprecation eventually, and then we can simply have the VisualTestCase use it. But as a way to get started, I say you go ahead with this methodology.

@bbengfort bbengfort added this to the Version 0.4.1 milestone May 10, 2017
@bbengfort bbengfort self-assigned this May 11, 2017
@bbengfort
Member

Ok, framework is in place -- but the specific checks were actually kind of difficult to implement. I'm going to punt on this for now, but you guys can take a look to see if the tests.checks module makes sense.

@bbengfort bbengfort modified the milestones: Backlog, Version 0.4.1 May 22, 2017
bbengfort added a commit that referenced this issue May 22, 2017
@ndanielsen
Contributor

Just found this interesting package; we might be able to modify it for our docstring conventions:

https://github.com/PyCQA/pydocstyle
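A quick sketch of how we might call it, assuming pydocstyle's programmatic check() entry point described in its docs (worth verifying; the file list here is illustrative):

from pydocstyle import check

# report any docstring convention violations found in the given source files
for error in check(["yellowbrick/base.py"]):
    print(error)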

@bbengfort
Member

Nice, I'll have to put that on my list of tools to use!

@rebeccabilbro
Member

@NealHumphrey @bbengfort @ndanielsen — I'm going to archive this now given that we have incorporated much of this (albeit manual) checking into #669 and the recent updates to the tests.

@bbengfort
Member

@rebeccabilbro I think you're right that we can close this since we haven't moved forward on it in 2 years. One last piece of cleanup might be tests/checks.py -- which was a partial prototype of this automatic checking functionality.
