Refactoring, Preserving History
30 July 2016
At work and in many open source projects there is a common pattern for managing code: every component lives in its own repository. While this keeps things separate, it does introduce some other challenges:
- Branches or tags (depending on your workflow) must be created in every repository for every release if you are using versioned releases.
- Certain code elements may be repeated in each repo (for setup.py or the repository's ...).
- Certain configuration files may need to be repeated.
- Organization patterns for files and naming conventions may begin to vary between repos.
- Moving code between repos or out of a repo destroys the history of the moved files.
We'll be touching on this last item.
A colleague asked me the other day how to preserve the history of some files as they were migrated into another repo. This is something we've done many times. Code accumulates in one repo that's sort of related (or in the "core" repository, an antipattern) and then later needs to be shared with another project. A new project is then created and the files are copied into it. Unfortunately, the history does not travel with them. At the time the question was posed to me, I did not have a good answer. Later that night as I showered, however, I came up with a halfway solution that at least allows history to be preserved when moving code into a brand new repo:
    cp -ipr oldrepo newrepo
    cd newrepo
    delete-the-unwanted-files
Simply copy the old repo to a new repo and delete the unwanted files. Using Mercurial, you would issue:
    hg mv oldrepo newrepo
    hg rm -f newrepo/herp.py
After some additional moving and renaming (and probably some use of perl -p -i -e), you will have the entire history of the remaining files.
Of course, you also end up with the history of all the other files that were in the repo as well. For larger, older repos and those with large files, this may represent an undesirable overhead. It's not perfect for all situations but can be useful in a pinch.
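As a rough illustration, the copy-then-delete idea can be wrapped in a small shell function. This is only a sketch under a few assumptions: it uses hg clone to make the copy (a swap for the plain copy described above, chosen because a clone carries the full history with it), the name split_repo and its arguments are hypothetical, and the paths in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch: carve a new Mercurial repo out of an old one, keeping history.
# Assumes `hg` is on PATH. split_repo and its arguments are hypothetical
# names introduced for this example.

split_repo() {
    old=$1      # path to the existing repository
    new=$2      # path for the new repository (must not exist yet)
    shift 2     # remaining arguments: files to drop from the new repo

    # Clone rather than copy; the clone contains every changeset.
    hg clone "$old" "$new" || return 1

    # Work in a subshell so the caller's working directory is untouched.
    (
        cd "$new" || exit 1
        # Delete the unwanted files and stop tracking them.
        hg rm -f "$@" || exit 1
        # Record the removal; older changesets still contain the files.
        hg commit -m "Remove files that stay in the old repository"
    )
}

# Example with placeholder names:
#   split_repo oldrepo newrepo herp.py
#   cd newrepo && hg log --follow some-remaining-file.py
```

Running hg log --follow on one of the remaining files in the new repo should list the changesets that touched it in the old one. As noted above, the removed files still live in the new repo's history; only the working copy gets smaller.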