Things To Do

Roadmap

improve solver and puzzle-maker for order=5
draft archive
refactor and improve canonicalization
compare backtracking and stochastic search statistics
implement grid-serdes translator tool
refine archive
create a separate github repo for data. add as submodule?
emscripten build and website
puzzle creation experiments (see research-questions.md)
optional async and batch-threading wrappers for long-running things
- hm. less sure about how necessary this is now knowing about std::async.
minlexing canonicalization
compare canonicalization methods (performance / time-complexity)
language bindings for other languages see dedicated todo-file

Periodic Checkups

reuse lint (check licensing)
Examine contents of exported symbols (nm -nCD build/<config>/lib/libokiidoku.so | less) and make sure nothing is exported that isn't intended to be.
Ensure dependencies that should be private to the library implementation aren't exposed in any way to the library interface.
Run static analyzers
Check which #ifdef __EMSCRIPTEN__ / __cpp_lib_... blocks can be removed as emscripten updates its sysroot libc++.
https://en.cppreference.com/w/cpp/language/rule_of_three
Write comments for custom CMake commands and targets

Issues I'm watching:

https://gitlab.kitware.com/cmake/cmake/-/issues/26118 cmake non-config presets don't implicitly inherit their config preset condition.
Use CMAKE_CROSSCOMPILING_EMULATOR to run cross compiled executables #554
https://gitlab.kitware.com/cmake/cmake/-/issues/15179 -Og default for debug build configuration

Misc List

can emscripten just install/package the runtime component? or how can I make it not include headers in the package? https://cmake.org/cmake/help/book/mastering-cmake/chapter/Packaging%20With%20CPack.html#cpack-and-cmake-install-commands
make repl support custom streams? or should that just be handled by caller of the progam? related: support saving session history files that can be later passed as program stdin to repro.
figure out why nanobind+asan is having issues. see also google/sanitizers#796 (comment).
./out/install/<namespace directory> := build preset name. can't see how to do this right now with there being no macro for buildPresetName in CMake Preset spec.
install nanobind stuff
create dedicated hidden CMake configure preset starting point for packaging workflow?
launch.json https://code.visualstudio.com/docs/editor/variables-reference#_input-variables

check CMAKE_LANG_COMPILER_FRONTEND_VARIANT to use msvc flags when compiler is clang-cl?

write a minimal contributing.md file. examples:
- https://github.com/libgit2/libgit2/blob/main/docs/contributing.md
profile for make_cols_valid not found in profile data.
- could this be due to anonymous/unnamed namespaces?
  - If so, how could I fix this?
- it's being inlined for both the training and optimized builds... do I need to make it not get inlined for the training build?
ctest for emscripten configuration
GCC -Wnrvo? or is it in -Wall?

usage docs for CMake:

building an app target (cmake --build . --config=Release --target okiidoku_cli)
installing an app target (cmake --install . --config=Release --component okiidoku_runtime [--prefix])

how to prevent program stack growth due to recursion for singles? consider giving CandPovs an internal data-only stack.

consider making subset finders take parameter of single subset size to try finding for instead of a ceiling-like parameter.

consider making cand masks have two lanes: one storing full O2BitArr, one storing just a byte. the byte lane is intended for use with subset finding and needs an additional array of o2x_t where indices (relative to the index of the beginning of the subset) correspond to bits of the byte mask, and the value is the value of the candidate (candidate tags). Then to remove a given candidate, can use simd to scan the candidate tags and get a byte where bits are set if the corresponding tag value matched the cand to remove, and then remove set bits of that byte from the byte storing subset candidates. https://stackoverflow.com/questions/54897297/check-all-bytes-of-a-m128i-for-a-match-of-a-single-byte-using-sse-avx-avx2

Currently avoiding using OKIIDOKU_MONO_INT_TS_TYPEDEFS in headers in classes.
- Was worried about the "header size" cost. I think I'm probably prematurely optimizing.
- give this a try
Come up with a precondition-checking strategy for language bindings. Options:
- Leave it as a user responsibility to know and follow contracts.
- Throw exception on the other language side.
think about this
Collect some statistics on how much overlap there usually is between the size-4 UA sets found. A very narrow, simple way to do this is just count how many unique cells are covered, and compare to the number of non-unique cells covered.
- links to UA set discussions:
try focusing guesses on the cell ruling out a known solution
see the TODO for get_guess_grouping.
try to use unavoidable set / deadly pattern detection to optimize proper-puzzle-making.
- interesting.
possible "deduction technique": given a cell that has multiple candidates, try guessing each candidate (to create "derivatives") and see how much more can be deduced for each derivative without further guessing after that. Any candidates that are ruled out for all the derivatives can be ruled out from the base.
consider a finder-heuristic that only starts using more powerful finders when the guess stack gets to a certain depth- at which point it unwinds to one-above the last frame that used more powerful finders, and uses more powerful finders.
- Hm. But the guess stack depth doesn't necessarily correspond to when it may make the most sense to again use more expensive deduction techniques, since I don't know if there's any saying how much further deduction a guess may enable.
- will have to create some modified version of unwinding which doesn't rule out the guess.
- or more fine grained: each guess stack frame has a "logic depth/effort" field that describes the finders that were used before the guess was made. Ex. used up to finding subsets of size 2 before the guess was made.
play around with giving puzzle maker a threshold to stop trying to solve a candidate puzzle after N guesses, or give up (or mark as try-again-later) seeing if a given is safe to remove after N guesses.
Interesting:
cppcoreguidelines C.2: use class if there is an invariant, and struct otherwise.
- hm. I mean- for a lot of my really simple structs there are invariants on integer bounds. Seems like judgement is required to choose where to draw the line with this guideline.
look into cppitertools for writing my raw loops. see if people say anything about performance overhead.
CI/CD
- CPM notes
- catch2 notes
find out how to use the cppcoreguidelines checker
- I want to use it in a way that is driven by cmake
see if grid qualities (like being a solution, being a proper puzzle, being a minimal puzzle), can be encoded through the type system and make it so that always-safe conversions (such as ) are easy, but "unsafe" (not always true) have to either go through an unsafe_cast_X_grid_to_Y_grid function, or go through a checked_cast_X_grid_to_Y_grid, which may have a non-trivial performance penalty.
- This would allow making many of the current contracts part of the type system; turning-runtime-error-into-compiler-errors-TM. I would no longer need to write such contract and post-condition comments.
- Definitely useful types: SolutionGrid, ProperPuzzleGrid, MaybeInvalidPuzzleGrid.
- Maybe useful types: MinimalPuzzleGrid.
- Things that might get ugly: the way they are, grid_is_filled and grid_is_valid would still require the caller to do some kind of unsafe-moving-cast to a new variable of a stronger type.
try DLL's again? read the docs actually.
- general rules and limitations
try writing custom GDB python pretty-printer for O2BitArr?
- how to write a gdb pretty-printer
- also look into visual studio .natvis files
- interesting: https://github.com/asarium/gdb-natvis
- https://sourceware.org/gdb/onlinedocs/gdb/dotdebug_005fgdb_005fscripts-section.html
move the emoji definitions out of the program binary and externalize as a configurable data read in at runtime?
Go back and try the old canonicalization by rel row prob, but break ties by doing some brute force: try each tied permutation and valuate it according to some reduction of how it pushes rarer rel counts to the top left. Just be careful to shift to get rid of the main diagonal seam.
- If there are multiple puddles of ties, the resolution of a puddle shouldn't depend on the resolution of any other puddle- only on the non-tied rows/columns. A consequence of this is that this resolution algorithm will not work if there are no non-tied rows/columns.

Consider: The current relabelling canonicalization method may have a big weakness: I think ties can be easily crafted: consider the "Most Canonical" solution grid- it would be all ties. How can this be addressed? (Or perhaps the "Most Canonical" grid is the only weakness?)
- Break ties by designing a way to give symbols that frequently cohabit atoms label-values that are closer together in value:
- The cohabitation table can be tiled; a coordinate's vertical or horizontal distance (these are the same, since relabelling moves both the col and row together) from the closest tiling of the main diagonal represents the distance of the labels from each other.
- For each label, make an array where each index represents another label, and the value is an object containing the left and right distances between them, and the cohabitation count.
- Hm. So far this seems to suggest a hill-climbing / brute-force-type solution...

Decide what interfaces to support:
- Probably best to start with just readline and a CLI
  - For CLI util libraries, look into using
    - repl
      - https://github.com/daniele77/cli
    - command
      - https://github.com/CLIUtils/CLI11
    - FTXUI
- Can look into ncurses in the future? Or look into options for TUI libraries?
- A web interface would be really nice.
C++20
- using enum. Might want to wait for CLANG to support?
C++23
- http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2214r0.html#does-adjacent-mean-2-or-n
- std::to_underlying() strange since cppref says all my standard library versions support it already? but it doesn't compile?
- std::unreachable()
- std::ranges::iota()
- std::views::chunk()
  - could this be useful for canon_place? I tried messing around with range-v3 on compiler explorer and had trouble with the whole action vs algorithm, container vs view thing. Didn't know what I was doing and could achieve what I wanted.
- alternative to std::chunk, look into mdspan (multi-dimensional span). Seems like this is more of what I'm looking for.
- multidimensional subscript operator
CLI
- implement -h and --help CLI argument.
- give a red message when trying to continue and nothing is left to be found.
do I want to generate doxygen docs?
- https://jothepro.github.io/doxygen-awesome-css/index.html

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TODO.md

TODO.md

Things To Do

Roadmap

Periodic Checkups

Misc List

Files

TODO.md

Latest commit

History

TODO.md

File metadata and controls

Things To Do

Roadmap

Periodic Checkups

Misc List