GSoC - Week 7

Jul 21, 2020 by Abhishek Kumar

As with any project, it is the last 20% that takes up 80% of the time.

The statement could not be more accurate about my last week’s work on corrected commit date offsets. The patch is currently failing tests for t5324-split-commit-graph and t6600-test-reach but is ready for the first round of review.


I started with a mega patch that:

  1. Implemented generation data chunk.
  2. Implemented corrected commit date offsets.
  3. Re-used some pieces of code.
  4. Fixed performance regression for computing bloom filters.
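
For readers unfamiliar with item 2, here is a minimal sketch of how a corrected commit date and its offset can be computed. It follows the definition in Git's commit-graph design notes: a root commit keeps its committer date, and any other commit takes the larger of its own committer date and one more than the largest corrected date among its parents. The offset is the corrected date minus the committer date. The function name `corrected_commit_dates` and the three-commit toy history are made up for illustration; this is Python, not the C code in the patch.

```python
def corrected_commit_dates(dates, parents):
    """Return {commit: corrected date} for a toy history.

    dates:   {commit: committer timestamp}
    parents: {commit: [parent commits]}
    Hypothetical helper for illustration only. It recurses over parents,
    which is fine for a toy graph but not for a deep real history.
    """
    corrected = {}

    def visit(commit):
        if commit not in corrected:
            # One more than the largest corrected date among the parents,
            # or 0 for a root commit (so the committer date wins).
            floor = max((visit(p) + 1 for p in parents.get(commit, [])), default=0)
            corrected[commit] = max(dates[commit], floor)
        return corrected[commit]

    for commit in dates:
        visit(commit)
    return corrected


# Toy history A <- B <- C, where B and C carry skewed (too early) timestamps.
dates = {"A": 100, "B": 90, "C": 95}
parents = {"A": [], "B": ["A"], "C": ["B"]}

corrected = corrected_commit_dates(dates, parents)
offsets = {c: corrected[c] - dates[c] for c in dates}
print(corrected)  # {'A': 100, 'B': 101, 'C': 102}
print(offsets)    # {'A': 0, 'B': 11, 'C': 7}
```

The point of the correction is that a commit's corrected date is always strictly greater than that of each of its parents, even when the raw timestamps are skewed.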

The tests failed (because of course they would). I spent a day chasing bugs around in gdb (which I have learned to use well, and I look forward to reading Stallman’s book after GSoC), but there were too many points of failure. These included:

  1. Incorrectly computing corrected commit date offsets.
  2. Incorrectly writing generation data chunk.
  3. Incorrectly writing commit-graph.
  4. Incorrectly reading commit-graph.
  5. Incorrectly parsing generation data chunk.
  6. Incorrectly traversing the graph.
  7. Tests that were no longer accurate (as happened in t5000-tar-tree, t5318-commit-graph, and others).

And others too.

After getting frustrated over the lack of headway, I started to rewrite the changes as small, self-contained patches (which, in hindsight, I should have done from the beginning). What was once a mess touching over 13 files is now neatly laid out in tiny morsels of code changes.

Failed Tests

As I mentioned, the patch is failing tests for t5324-split-commit-graph and t6600-test-reach.

The failed test for test-reach is what I am working on at the moment. We set up a repository where commits are named in the form commit-x-y (where 1 <= x, y <= 10) and commit-x-y can reach commit-i-j if i <= x and j <= y.

We look up commits reachable from commit-6-6 but not from commit-3-8. These are the commits of the form commit-i-j where 3 < i <= 6 and 1 <= j <= 6.
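
The expected set can be checked with a small brute-force sketch. It models only the reachability rule stated above (commit-x-y reaches commit-i-j if i <= x and j <= y) on the 10 x 10 grid; it does not run Git or the actual test script, and the helper name `reaches` is made up for illustration.

```python
def reaches(src, dst):
    """True if commit-x-y (src) can reach commit-i-j (dst) under the grid rule."""
    x, y = src
    i, j = dst
    return i <= x and j <= y


# All commits commit-x-y with 1 <= x, y <= 10, as (x, y) pairs.
commits = [(x, y) for x in range(1, 11) for y in range(1, 11)]

# Reachable from commit-6-6 but not from commit-3-8.
expected = {c for c in commits if reaches((6, 6), c) and not reaches((3, 8), c)}

# The closed form: 3 < i <= 6 and 1 <= j <= 6.
closed_form = {(i, j) for i in range(4, 7) for j in range(1, 7)}

assert expected == closed_form
print(len(expected))       # 18
print((3, 1) in expected)  # False
print((2, 1) in expected)  # False
```

Both commit-3-1 and commit-2-1 are reachable from commit-3-8, so neither belongs in the output.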

While the tests pass for --topo-order and --first-parent individually, when we combine them into --topo-order --first-parent, the rev-list output unexpectedly contains commit-3-1 and commit-2-1.

Future Plans

Here’s what is left for the patch series:

  1. Add an option to skip writing generation data chunk (to test whether new Git works without GDAT as intended).
  2. Handle writing to commit-graph for mismatched version (that is, merging all graphs into a new graph with a GDAT chunk).
  3. Pass tests with GIT_TEST_COMMIT_GRAPH=1. More tests might fail as I fix the current tests.

I look forward to everyone’s reviews!