Internals/OptimizeHTTP
Development
- Benchmark repositories TODO:
- small pack without new patches
- small pack with a few new patches
- small pack with many new patches
- big pack without new patches
- big pack with a few new patches
- big pack with many new patches
Then compare times of get --no-packs --lazy, get --packs --lazy, get --no-packs --complete, and get --packs --complete.
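The comparison above can be scripted. The following is a minimal sketch in Python, not part of Darcs: it builds the four `darcs get` command lines from the flags listed above and times each one. The helper names `get_commands` and `time_command`, and the `bench...` destination directory names, are made up for illustration.

```python
import itertools
import subprocess
import time

# Flags taken from the benchmark plan above.
LAZINESS_FLAGS = ["--lazy", "--complete"]
PACK_FLAGS = ["--no-packs", "--packs"]


def get_commands(repo_url, dest_prefix="bench"):
    """Return one `darcs get` command line per flag combination."""
    commands = []
    for laziness, packs in itertools.product(LAZINESS_FLAGS, PACK_FLAGS):
        # Hypothetical destination name, e.g. "bench--packs--lazy".
        dest = f"{dest_prefix}{packs}{laziness}"
        commands.append(["darcs", "get", packs, laziness, repo_url, dest])
    return commands


def time_command(command, run=subprocess.run):
    """Run one command and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    run(command, check=True)
    return time.perf_counter() - start
```

A benchmark run would then be `for c in get_commands(url): print(c, time_command(c))`, repeated for each of the six repository shapes.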
Benchmarks with existing big repositories
- Xmonad: ~ 1200 patches, http://code.haskell.org/xmonad
- DDC: ~ 4500 patches, http://code.ouroborus.net/ddc/ddc-head
- Agda: ~ 4600 patches, http://code.haskell.org/Agda
- Darcs: ~ 11000 patches, http://darcs.net
- Nikki and the robots: ~ 2000 patches, http://code.joyridelabs.de/nikki
- ~ 7500 patches, http://darcs.net/darcs-wiki
Optimizing a repository for HTTP transfer
To reduce the number of files that must be transferred over the network, the optimize --http command packs a repository into two tarballs, basic.tar.gz and patches.tar.gz, with the following content:
basic.tar.gz
- _darcs/hashed_inventory
- _darcs/meta-filelist-pristine
- _darcs/meta-filelist-inventories
- _darcs/meta-*
- _darcs/hashed.pristine/*
- _darcs/inventories/*
The meta-filelist-* files contain directory listings for the hashed.pristine and inventories directories, in reverse order with respect to the tarball itself. During a get, the files in these listings are downloaded through the cache in parallel with the tarball.
meta-* files in general contain additional files and information that could extend the tarballs' functionality in some way. They are expected to be small, so that the negative effect on performance is minimal.
patches.tar.gz
- _darcs/patches/*
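The layout above can be illustrated with a small Python sketch. This is not the Darcs implementation, only a model of the rules listed here: `plan_packs` assigns repository-relative paths to one of the two tarballs, and `filelist` produces a directory listing in reverse of the tarball order, as described for the meta-filelist-* files. Both function names are hypothetical.

```python
from pathlib import PurePosixPath

# Top-level _darcs entries that go into basic.tar.gz, per the list above.
BASIC_ENTRIES = ("hashed_inventory", "hashed.pristine", "inventories")


def plan_packs(repo_files):
    """Split repository-relative paths between the two tarballs."""
    basic, patches = [], []
    for path in sorted(repo_files):
        parts = PurePosixPath(path).parts
        if len(parts) < 2 or parts[0] != "_darcs":
            continue  # files outside _darcs are not packed
        if parts[1] == "patches":
            patches.append(path)
        elif parts[1] in BASIC_ENTRIES or parts[1].startswith("meta-"):
            basic.append(path)
        # Anything else under _darcs is left out of both tarballs.
    return {"basic.tar.gz": basic, "patches.tar.gz": patches}


def filelist(tar_members, directory):
    """List one directory's files in reverse of their tarball order."""
    prefix = f"_darcs/{directory}/"
    names = [m[len(prefix):] for m in tar_members if m.startswith(prefix)]
    return list(reversed(names))
```

For example, `filelist(plan["basic.tar.gz"], "hashed.pristine")` gives the content that would be written to meta-filelist-pristine under this model.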
Getting an optimized repository
- Download and unpack basic.tar.gz. Result: lazy repository from the time when optimize --http was run.
- Pull from parent repository. Result: lazy repository from current time.
- Download and unpack patches.tar.gz. Result: full repository.
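The three steps above can be sketched as follows. This is an illustrative Python model, not Darcs code: `fetch` and `pull` are hypothetical callables supplied by the caller (one returns the bytes of a named tarball, the other brings the repository up to date), and skipping files that already exist is an assumption made so that an unpack never overwrites newer state.

```python
import io
import os
import tarfile


def unpack_pack(tar_gz_bytes, dest_dir):
    """Unpack a pack tarball into dest_dir and return the names written."""
    written = []
    with tarfile.open(fileobj=io.BytesIO(tar_gz_bytes), mode="r:gz") as tar:
        for member in tar:
            name = member.name
            # Only regular files with safe relative paths are extracted.
            if not member.isfile() or os.path.isabs(name) or ".." in name.split("/"):
                continue
            target = os.path.join(dest_dir, name)
            if os.path.exists(target):
                continue  # assumption: keep a file that is already present
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with tar.extractfile(member) as src, open(target, "wb") as out:
                out.write(src.read())
            written.append(name)
    return written


def get_optimized(fetch, pull, dest_dir, lazy=True):
    """Follow the three steps: basic pack, pull, then optionally patches."""
    unpack_pack(fetch("basic.tar.gz"), dest_dir)  # lazy repo, as of packing time
    pull(dest_dir)                                # lazy repo, current
    if not lazy:
        unpack_pack(fetch("patches.tar.gz"), dest_dir)  # full repo
```

Passing `lazy=True` stops after the second step, which corresponds to a lazy get.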
Benchmarks from 2011
How does optimize --http improve the user experience?
- Jérémie’s repo (~900 patches): from 10s (get --no-packs) to 1s (get)
- http://darcs.net/ (~9300 patches): darcs optimize --http takes 14s to run. _darcs goes from 54 MBytes to 64 MBytes (indeed _darcs/packs/ is 11 MBytes). Complete get: from 37 to 2 minutes, lazy get from 27 seconds to 7 seconds.
screened + 12 patches:
.       packs    no-packs
lazy    30s      1m30s
full    2m30s    31m
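To make the figures above easier to compare, the speedup factors can be computed directly. The short Python sketch below uses only the times quoted in this section; the labels and the helper names `seconds` and `speedup` are illustrative.

```python
def seconds(minutes=0, secs=0):
    """Convert a minutes/seconds pair to seconds."""
    return 60 * minutes + secs


def speedup(without_packs, with_packs):
    """Return how many times faster the get is with packs."""
    return without_packs / with_packs


# (time without packs, time with packs), in seconds, from the 2011 numbers.
MEASUREMENTS = {
    "Jérémie's repo": (10, 1),
    "darcs.net, complete get": (seconds(37), seconds(2)),
    "darcs.net, lazy get": (27, 7),
    "screened + 12 patches, lazy": (seconds(1, 30), 30),
    "screened + 12 patches, full": (seconds(31), seconds(2, 30)),
}

for name, (before, after) in MEASUREMENTS.items():
    print(f"{name}: {speedup(before, after):.1f}x")
# Jérémie's repo: 10.0x
# darcs.net, complete get: 18.5x
# darcs.net, lazy get: 3.9x
# screened + 12 patches, lazy: 3.0x
# screened + 12 patches, full: 12.4x
```

In these measurements the gain is largest for complete gets, where the most files would otherwise be fetched one by one.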