GSoC

Timeline for GSoC 2014

  • 14 Feb: Mentoring organization application deadline.
  • 24 Feb: List of accepted mentoring organizations published on the Google Summer of Code 2014 site.
  • 10-21 Mar: Student application period.
  • 7 Apr: Mentoring organizations should have requested slots via their profile in Melange by this point.
  • 9 Apr: Slot allocations published to mentoring organizations.
  • 15 Apr: First round of de-duplication checks happens; organizations work together to try to resolve as many duplicates as possible.
  • 21 Apr: Accepted student proposals announced on the Google Summer of Code 2014 site. Community Bonding Period.
  • 19 May: Students begin coding for their Google Summer of Code projects.
  • 23-27 Jun: Mid-term evaluations.
  • 11 Aug: Suggested ‘pencils down’ date. Take a week to scrub code, write tests, improve documentation, etc.
  • 18 Aug: Firm ‘pencils down’ date.
  • 18-22 Aug: Final evaluation.
  • 25 Aug: Final results of Google Summer of Code 2014 announced.

Project ideas

Here are some ideas for 2014 Google Summer of Code student projects. Note that these themes are just to get you started; we welcome submissions beyond these initial ideas. Get in touch with us at darcs-users@darcs.net or on #darcs on freenode.

All of these projects require good Haskell skills. We trust and appreciate students who have contributed to darcs before applying :-)

1. Hashed files and cache

Proposed by/willing to mentor: Guillaume

Hashed files in darcs follow the same idea as in git repositories or the BitTorrent protocol: a file is saved on disk using the hash of its contents as its name. Darcs uses them in many places. However, some aspects could be improved.

Cache system: darcs maintains a global cache in ~/.cache/darcs/ that is shared between all repositories of a user. This makes many operations faster and saves disk space by using hard links. However, when the cache gets too big it becomes a problem of its own, since filesystems do not cope well with directories containing a huge number of files. The idea would be to implement a bucketed cache, i.e. to use prefix directories.
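The bucketing idea can be illustrated with a minimal Python sketch. It is not darcs code: the use of a plain SHA-256 hex digest as the file name, the two-character prefix length, and the function names `content_hash`, `bucketed_path` and `store` are all assumptions made for illustration, and darcs's actual naming scheme may differ.

```python
import hashlib
from pathlib import Path

# Assumption: two hex characters per bucket, i.e. at most 256 prefix directories.
BUCKET_PREFIX_LEN = 2


def content_hash(data: bytes) -> str:
    """Return the hex SHA-256 digest of the contents, used here as the file name."""
    return hashlib.sha256(data).hexdigest()


def bucketed_path(cache_root: Path, digest: str) -> Path:
    """Place the file in a subdirectory named after the first characters of its hash."""
    return cache_root / digest[:BUCKET_PREFIX_LEN] / digest


def store(cache_root: Path, data: bytes) -> Path:
    """Write data into the bucketed cache unless a file with that hash already exists."""
    target = bucketed_path(cache_root, content_hash(data))
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        target.write_bytes(data)
    return target
```

With this layout a cache of N files spreads over up to 256 directories of roughly N/256 entries each, instead of one directory holding all N.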

Garbage collection: darcs only knows how to clean up the _darcs/pristine.hashed directory, which contains the recorded state of the working copy. We should extend garbage collection to the patches and inventories of repositories. As of now, the only way to clean them is to make a fresh clone of the repository. See Using/GrowingInventoriesProblem and http://bugs.darcs.net/issue1987. Note that darcs currently cleans repositories only when the “optimize” command is run (http://bugs.darcs.net/issue687), and we will keep this behaviour.
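One way to think about collecting patches and inventories is a mark-and-sweep pass: start from the current inventory, mark everything reachable from it, and treat the rest as garbage. The Python sketch below works on a simplified in-memory model; the `Inventory` type, the assumption that each inventory points to at most one parent inventory, and the function names are hypothetical and do not describe the real on-disk format.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Set, Tuple


class Inventory(NamedTuple):
    parent: Optional[str]  # hash of the previous inventory, if any (assumed model)
    patches: List[str]     # hashes of the patches listed in this inventory


def live_hashes(head: str, inventories: Dict[str, Inventory]) -> Tuple[Set[str], Set[str]]:
    """Mark phase: follow the inventory chain from head, collecting what is still in use."""
    live_inventories: Set[str] = set()
    live_patches: Set[str] = set()
    current: Optional[str] = head
    while current is not None and current not in live_inventories:
        live_inventories.add(current)
        inventory = inventories[current]
        live_patches.update(inventory.patches)
        current = inventory.parent
    return live_inventories, live_patches


def garbage(stored: Iterable[str], live: Set[str]) -> List[str]:
    """Sweep phase: every stored hash that was not marked live can be removed."""
    return sorted(set(stored) - live)
```

In keeping with the behaviour described above, such a pass would only run as part of “optimize”, never implicitly during other commands.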

More uses for the global cache: the global cache stores files that correspond to previous states of a user’s repositories. If a regular repository also recorded the information needed to track its own previous states, we could implement a darcs undo command that reverts commands like obliterate, unrecord, amend-record, record, etc.

Going further, we could implement a darcs undelete command that would dig in the global cache and use the filesystem date to bring back to life previously deleted repositories.

See also: Internals/CacheSystem, Internals/HashedPristine, Internals/Hashes, http://en.wikibooks.org/wiki/Understanding_Darcs/Getting_started.

Tasks

2. Optimize optimize --reorder and other patch reordering issues

Proposed by/willing to mentor: Guillaume

In their representation on the filesystem, the patches of a repository are linearly ordered. The command optimize --reorder reorders patches so that untagged patches are moved to the “front” of this order. How do untagged patches end up behind a tag in the first place? This happens when you pull tags from a remote repository.
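The intended effect can be shown on a toy model in Python. This is only a sketch under strong assumptions: the history is a list ordered oldest first (so the “front” is the end of the list), a tag simply carries the set of patch names it covers, only the most recent tag is considered, and patches are assumed to commute freely. The `Entry` type and the `reorder` function are hypothetical and say nothing about how darcs actually implements the command.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Entry(NamedTuple):
    name: str
    covers: Optional[FrozenSet[str]] = None  # set only for tags: names the tag covers


def reorder(history: List[Entry]) -> List[Entry]:
    """Move entries not covered by the latest tag after that tag, keeping relative order."""
    tags = [entry for entry in history if entry.covers is not None]
    if not tags:
        return list(history)
    tag = tags[-1]
    covered = [e for e in history if e.name in tag.covers]
    uncovered = [e for e in history if e.name not in tag.covers and e.name != tag.name]
    return covered + [tag] + uncovered


# A local patch was recorded before the tag "v1" was pulled from a remote repository.
history = [Entry("a"), Entry("local"), Entry("b"), Entry("v1", frozenset({"a", "b"}))]
reordered = reorder(history)
```

In this toy model the tag ends up directly after the patches it covers, and running `reorder` a second time changes nothing.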

Such reordering reduces the amount of data that a typical remote command needs to download. It also reduces the CPU time needed for some operations (which ones?). But the reordering itself takes some computation, which is why we don’t do it all the time.

The current behaviour of the optimize --reorder algorithm is not yet completely understood. For instance, the command is not idempotent in certain cases, and on some repositories it is abnormally slow.

Tasks

3. Better patch dependencies

Proposed by: Florent, Ganesh, Guillaume, Owen

  • in whatsnew and record, show which recorded changes the unrecorded changes sit on (depend on)
  • automatically discover patch dependencies (amend --ask-deps) when a given test fails without them
  • visualization of patch dependencies
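For the visualization idea, one possible starting point is to emit the dependency relation in Graphviz DOT format and let an external tool draw it. The Python sketch below assumes the dependencies are already available as a mapping from patch name to the names it depends on; how to extract that mapping from a repository is exactly the project work and is not shown. The function names are hypothetical.

```python
from typing import Dict, List


def quote(name: str) -> str:
    """Quote a patch name as a DOT string, escaping backslashes and double quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dot(dependencies: Dict[str, List[str]]) -> str:
    """Render a mapping from patch name to the patches it depends on as a DOT digraph."""
    lines = ["digraph patches {"]
    for patch in sorted(dependencies):
        lines.append(f"  {quote(patch)};")
        for dep in sorted(dependencies[patch]):
            # An edge points from a patch to a patch it depends on.
            lines.append(f"  {quote(patch)} -> {quote(dep)};")
    lines.append("}")
    return "\n".join(lines)
```

The resulting text can be passed to the Graphviz `dot` program to produce an image.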

4. Better conflicts handling/UI

Proposed by/willing to mentor: Ganesh

5. Use darcs as a conflict-resolution tool for any VCS

Proposed by/willing to mentor: Ganesh

Resolving conflicts during a merge in any VCS involves (at least implicitly) reconstructing the semantic intent of the changes on each side, and then applying them on top of each other.

Darcs patches are a great way of expressing semantic intent explicitly. Build a tool based on darcs patches where the user can reconstruct the changes for both sides of the conflict as a chain of darcs patches, and then use the darcs merge result to actually resolve the conflict (or at least cut it down). As well as the existing darcs patch types, this offers a lot of scope for adding new types just for the tool, as we won’t need to worry about the usual backwards compatibility concerns - for example it might try to parse the changes as source code. The tool can also try to automatically infer the patches as well as allowing the user to enter them explicitly.

6. Develop darcsden as a local darcs UI

Proposed by/willing to mentor: Ganesh

Darcsden is currently focused primarily on being a multi-user, server-based tool for hosting darcs repositories. However, there’s a lot of overlap with local manipulation of a darcs repository on a workstation. Extend/generalise Darcsden so it can be used for both.

7. Distributed issue tracking for darcsden

Proposed by/willing to mentor: Ganesh

Change the darcsden issue tracker to store the issues themselves in a separate darcs repository. Other people have tried distributed issue tracking before, so this also involves investigating previous solutions.

8. Other projects

Keep in mind that you can always propose a project with a whole different set of ideas. Be creative! :-)

Other project ideas:

Application process

  1. Sketch out an idea. Can you make Darcs faster? Can you make it more useful? It would make sense to get in touch with darcs-users@darcs.net for some help.

  2. Check out the student guide to know what you’re getting into.

  3. Get in touch with the Darcs team if you have not done so already.

  4. Write up your proposal (this should take a day or two). See the previous applications if you’re having trouble getting started.

  5. Submit your application on the GSoC website: register as a student first, then submit your application.

Older projects

  1. 2013
  2. 2012
  3. 2011

  4. 2010

  5. 2009 - Hashed storage (Petr Rockai)

  6. 2007 - Darcs 2 research (Jason Dagit)

See also