- Project ideas
- Application process
- Current projects
- Older projects
- See also
- March 18 to March 29: Mentoring organization application
- April 8: List of accepted mentoring organizations published
- April 22 to May 3: Student application
- May 8: Slot allocations published to mentoring organizations
- May 9 to 22: Slot allocations trading
- May 27: Accepted student proposals announced on the Google Summer of Code 2013 site.
- June 17: official SoC start
- July 29 to August 2: midterm evaluation submission
- September 16: suggested ‘pencils down’ date
- September 23: firm ‘pencils down’ date
- September 23 to 27: evaluation submission
- October 1: final results announced by Google
- October 19-20: Summer of Code mentor summit
Here are some ideas for 2014 Google Summer of Code student projects. They probably need to be cut down to plans that could realistically fit into the Summer of Code timeframe. Note that these themes are just to get you started. We welcome submissions beyond these initial ideas. Get in touch with us at email@example.com or #darcs on freenode.
All of these projects require good Haskell skills. Bonus points if you have already contributed code to darcs.
This may be the project that involves the least modification of the “core” of darcs, and thus may be more accessible to newcomers.
Proposed by: Guillaume
Hashed files in darcs follow the same idea as in git repositories or the bittorrent protocol: a file is saved on the disk using its own hash as name. Darcs uses them in many places. However some aspects could be improved.
Cache system: darcs maintains a global cache in ~/.cache/darcs/ that is shared between all repositories of a user. This makes many operations faster and saves disk space by using hard links. However, when the cache gets too big, it becomes a problem of its own, since filesystems do not cope well with directories with zillions of files inside. The idea would be to implement a bucketed cache, i.e. to use prefix directories.
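To make the prefix-directory idea concrete, here is a minimal sketch in Python (chosen only for brevity; darcs itself is written in Haskell). The function names, the two-character prefix and the flat-to-bucketed migration are assumptions made for illustration, not the layout darcs actually uses.

```python
import os

def bucketed_path(cache_dir, file_hash, prefix_len=2):
    """Return the path of a hashed file inside its prefix ("bucket") directory."""
    if len(file_hash) <= prefix_len:
        raise ValueError("hash is too short to be bucketed")
    return os.path.join(cache_dir, file_hash[:prefix_len], file_hash)

def migrate_to_buckets(cache_dir, prefix_len=2):
    """Move every flat file of cache_dir into its bucket; return how many moved."""
    moved = 0
    for name in os.listdir(cache_dir):
        source = os.path.join(cache_dir, name)
        # Existing bucket directories and too-short names are left alone.
        if not os.path.isfile(source) or len(name) <= prefix_len:
            continue
        target = bucketed_path(cache_dir, name, prefix_len)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        # A rename keeps the inode, so hard links from repositories survive.
        os.rename(source, target)
        moved += 1
    return moved
```

With hexadecimal hashes and a two-character prefix, the files are spread over at most 256 buckets, so each directory holds roughly 1/256 of the entries. A real implementation would also have to keep reading the old flat layout for a while.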
Garbage collection: darcs only knows how to clean up the _darcs/pristine.hashed directory. This directory contains the recorded state of the working copy. We should extend garbage collection to the patches and inventories of repositories. As of now, the only way to clean them is to do a new repository clone. See Using/GrowingInventoriesProblem and
- (re)implement bucketed cache: a long lost piece of code from 2010 that never made it into darcs but should have (
- implement garbage collection for
- implement garbage collection for
- investigate and implement global cache garbage collection (
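For the global cache item above, one possible starting point is the hard-link count. The sketch below (Python, for brevity) assumes that every repository using a cached file holds a hard link to it, so a cache file with a link count of 1 is referenced by nobody. That assumption breaks when a repository lives on another filesystem and had to copy the file instead of linking it. The function names are hypothetical.

```python
import os

def unreferenced_cache_files(cache_dir):
    """Yield cache files whose only hard link is the cache entry itself."""
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            # st_nlink counts the directory entries pointing at this inode.
            if os.stat(path).st_nlink == 1:
                yield path

def collect_garbage(cache_dir, dry_run=True):
    """Return the unreferenced files; delete them unless dry_run is set."""
    victims = list(unreferenced_cache_files(cache_dir))
    if not dry_run:
        for path in victims:
            os.remove(path)
    return victims
```

Keeping `dry_run=True` as the default lets a user inspect what would be removed before anything is deleted.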
darcs undo: go to some earlier alternative version of the current repository
darcs undelete: scan inventory files of cache to resurrect deleted repositories
Dive into the fantastic world of hashed files, filesystems and hard links! Discover that having 100,000 files in the same directory may not be the greatest idea ever! Linus Torvalds knew it from the beginning and never told us!
Proposed by: Guillaume
In their representation on the filesystem, patches of a repository are linearly ordered. The command optimize --reorder reorders patches so that untagged patches are moved to the “front” of this order. How could such untagged patches arrive there? Well, this happens when you pull tags from a remote repository.
Such reordering reduces the amount that a typical remote command needs to download. It also reduces the CPU time needed for some operations (which ones?). But it requires some calculation on its own. That’s why we don’t do it all the time.
The current behaviour of the optimize --reorder algorithm is not yet completely understood. For instance, the command is not idempotent in certain cases. On some repositories it is abnormally slow.
- understand the
- study whether patch reordering should be idempotent (normal form for repository inventories?)
- collect hand-designed and real-life test repositories to measure patch reordering performance
- improve patch reordering performance
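The idempotence question in the list above can be phrased as a small property check. The sketch below is in Python for brevity, and `reorder` is only a toy stand-in: a stable partition that ignores patch commutation entirely. It is not the algorithm behind optimize --reorder. Only `is_idempotent` would carry over to a real test harness.

```python
def reorder(patches, tagged):
    """Toy reordering: patches covered by a tag first, untagged ones after.

    The relative order inside each group is kept. Real darcs would have to
    commute patches past each other, which this sketch does not model.
    """
    covered = [p for p in patches if p in tagged]
    untagged = [p for p in patches if p not in tagged]
    return covered + untagged

def is_idempotent(transform, value):
    """Check that applying transform twice gives the same result as once."""
    once = transform(value)
    return transform(once) == once
```

Running `is_idempotent` over a collection of test repositories would flag the ones where a second reordering still changes the inventory.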
darcs send --minimize-context, i.e. the ability to create patch bundles with as few patches as possible in the context. The implementation will involve some thinking about heuristic versus exact approaches, since this task may be computationally costly.
This is probably the most “hardcore” project since it involves diving into the patch code of darcs. Sources say it may not be the friendliest piece of code ever. But this is also the most interesting and specific project, since first-class patches are what makes darcs unique among revision control systems. Along with the code, we will insist on having good documentation of what is going on.
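As a rough illustration of what a minimal context means, the sketch below (Python, for brevity) assumes the dependencies between patches are already known and given as an explicit mapping. That is a simplification made for this example: in darcs, dependencies are not stored like this and have to be discovered, which is presumably where the cost comes from. The names `minimal_context` and `depends_on` are hypothetical.

```python
def minimal_context(selected, depends_on):
    """Return the patches outside `selected` that the selected ones need.

    `depends_on` maps a patch name to the names it depends on directly;
    the result is the transitive closure of those dependencies.
    """
    selected = set(selected)
    needed = set()
    stack = list(selected)
    while stack:
        patch = stack.pop()
        for dep in depends_on.get(patch, ()):
            # Selected patches travel in the bundle itself, not in the context.
            if dep not in needed and dep not in selected:
                needed.add(dep)
                stack.append(dep)
    return needed
```

For example, with `depends_on = {"d": ["c"], "c": ["a"], "b": []}`, sending only `d` needs the context `{"a", "c"}`, and the unrelated patch `b` is left out.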
Proposed by: Florent, Ganesh, Guillaume, Owen
- show on record which changes the unrecorded changes sit on
- automatically discover patch dependencies (amend --ask-deps) when a given test fails without them
- Unresolved conflicts-related issues in the bug tracker
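The dependency-discovery idea in the list above could start from a simple greedy search. The sketch below is in Python for brevity. `test_passes` is a hypothetical callback that would apply the given patches and run the repository test, and the sketch ignores the fact that candidates may depend on each other.

```python
def minimal_dependencies(candidates, test_passes):
    """Greedily drop candidates while the test still passes.

    The result is 1-minimal: removing any single remaining patch makes the
    test fail. It runs the test once per candidate.
    """
    kept = list(candidates)
    for patch in list(candidates):
        trial = [p for p in kept if p != patch]
        if test_passes(trial):
            kept = trial
    return kept
```

The patches that survive are the ones worth proposing as explicit dependencies. Since each test run may be slow, a smarter search (bisection, for instance) may be needed on large repositories.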
The global cache could be used to easily go back to a previous state of a repository. We could think of a darcs undo command that proposes this. Hence history-changing commands like obliterate could be easily undoable without having to think about it beforehand (as when one uses
darcs resurrect, scans the global cache for the most recent inventory files and proposes to rebuild repositories from them.
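Finding the most recent files in a directory is the easy part of such a command. A minimal sketch in Python follows (for brevity). Which directory of the global cache holds the inventory files, and how to tell them apart from other hashed files, is left as an assumption: the caller passes the directory in.

```python
import os

def most_recent_files(directory, limit=10):
    """Return up to `limit` file paths of `directory`, newest first."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((os.path.getmtime(path), path))
    # Sorting the (mtime, path) pairs in reverse puts the newest file first.
    entries.sort(reverse=True)
    return [path for _mtime, path in entries[:limit]]
```

Modification times are only a heuristic here, since copying or restoring a cache can reset them.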
Keep in mind that you could always propose a project with a whole different set of ideas. Be creative! :-)
Other project ideas:
- Add darcs support to an existing GUI, for instance
Sketch out an idea. Can you make Darcs faster? Can you make it more useful? It would make sense to get in touch with firstname.lastname@example.org for some help.
Check out the student guide to know what you’re getting into
Get in touch with the Darcs team if you have not done so already
Write up your proposal (this should take a day or two). See the previous applications if you’re having trouble getting started.
Submit your application to the GSoC website. Register as a student first, then submit your application.
- Patch Index (BSRK Aditya)
2009 - Hashed storage (Petr Rockai)
- Petr’s application (Slightly post-edited after acceptance)
2007 - Darcs 2 research (Jason Dagit)