JS monorepos in prod 5: merging Git repositories and preserving commit history

At Adaltas, we maintain several open-source Node.js projects organized as Git monorepos and published on NPM. We shared our experience of working with Lerna monorepos in a series of earlier articles.
Now it is the turn of our popular open-source Node CSV project to be migrated to a monorepo. This article walks you through the available approaches, techniques, and tools used to migrate several Node.js projects hosted on GitHub into a Lerna monorepo. At the end, we provide the bash script we used for migrating the Node CSV project. This script can be applied to a different project with just a little modification.
Requirements for migration
The Node CSV project combines 4 NPM packages to work with CSV files in Node.js, wrapped by the umbrella csv package. Each NPM package has its own rich commit history, and we wanted to keep as much information as possible from the old repositories. Here are our requirements for the migration:
- preserve commit history with maximum information (such as tags, their messages, and merge commits)
- improve commit messages to follow the Conventional Commits specification
- preserve GitHub issues
Monorepo structure
We have 5 NPM packages to migrate to the Lerna monorepo: csv, csv-generate, csv-parse, csv-stringify, and stream-transform.
We want to achieve a directory structure that looks like this:
packages/
  csv/
  csv-generate/
  csv-parse/
  csv-stringify/
  stream-transform/
lerna.json
package.json
Choosing a Git log strategy
When migrating repositories into a monorepo, you merge their commit logs. We considered the 3 strategies described below.
- Single branch
It provides a straightforward log containing only the commits on the default (master) branches of all packages. The logs are joined sequentially by making the latest commit of the previous package the parent of the first commit of the next package. This strategy breaks the sorting of the log by commit date.
- Multiple branches with a common parent
This improves the readability of the log by keeping the branches of the different repositories separate. A new parent commit is added to the first commit of every branch. In the end, all the branches are merged into the default branch.
- Multiple branches with different parents
This strategy doesn't rewrite the first commits of the old repositories. It requires minimal intervention in the commit history and is logically more correct because, initially, the repositories didn't have a common parent.
Merging commit logs
Lerna has a built-in mechanism for gathering existing standalone NPM packages into a monorepo while preserving their commit history. The lerna import command imports a package from an external repository into packages/. The sequence of commands is fairly simple: you need to initialize the Git and Lerna repositories, make the first commit, and then start importing packages from locally cloned Git repositories. You can find basic usage instructions in the Lerna documentation.
Using lerna import, you can only follow the 1st or the 2nd Git log strategy described above. For the 2nd one, you need to create a separate branch per imported repository like this:
git checkout -b package-1
lerna import /path/to/package-1
git checkout master
git checkout -b package-2
lerna import /path/to/package-2
lerna import provides an easy-to-use tool to migrate repositories to a Lerna monorepo. However, it flattens the commit history, dropping merge commits, and it doesn't migrate tags and their messages. Unfortunately, these limitations didn't meet our requirement to keep maximum information from the existing repositories, and we had to use a different tool.
The native git merge command supports merging unrelated histories using the --allow-unrelated-histories option. It preserves the full commit history of the targeted branch along with its tags. In this case, you achieve the 3rd Git log strategy.
Merging the commit history of an external repository into the current one using --allow-unrelated-histories is as simple as running 2 commands:
git remote add -f <external-repo-name> <external-repo-path>
git merge --allow-unrelated-histories <external-repo-name>/<branch-name>
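The two commands above can be tried safely on throwaway repositories. The following is a minimal sketch, not the migration itself: the repository names (external, mono), the file names, the tag and the demo Git identity are all hypothetical and only serve the illustration. It checks that the merged history ends up with two root commits (the 3rd strategy) and that the tag of the external repository follows along.

```shell
#!/bin/bash
set -e

# Hypothetical identity so that commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

WORK=$(mktemp -d)

# An "external" repository with one commit and one annotated tag.
git init -q "$WORK/external"
git -C "$WORK/external" symbolic-ref HEAD refs/heads/master
echo 'module.exports = {}' > "$WORK/external/index.js"
git -C "$WORK/external" add .
git -C "$WORK/external" commit -q -m 'initial commit of external'
git -C "$WORK/external" tag -a v1.0.0 -m 'first release'

# The "current" repository, with its own unrelated first commit.
git init -q "$WORK/mono"
git -C "$WORK/mono" symbolic-ref HEAD refs/heads/master
echo '# mono' > "$WORK/mono/README.md"
git -C "$WORK/mono" add .
git -C "$WORK/mono" commit -q -m 'initial commit of mono'

# The two commands from the article.
cd "$WORK/mono"
git remote add -f external "$WORK/external"
git merge -q --allow-unrelated-histories external/master \
  -m "merge branch 'master' of external"

# Two root commits means both histories were kept without rewriting.
ROOTS=$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
TAGS=$(git tag -l)
echo "root commits: $ROOTS, tags: $TAGS"
```

Running git log --graph --oneline in the resulting repository shows the two independent lines of history joined by the merge commit.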
Rewriting commit messages
To bring more order and transparency to the combined commit log, we prefix all commit messages with their package names. Additionally, we make them compatible with the Conventional Commits specification, which we follow in our latest projects. This specification standardizes commit messages, making them more readable and easier to process automatically.
To implement this, we need to rewrite all commit messages by prefixing them with a string like chore(<package-name>): .
We chose the chore type just to make the messages compatible with the specification, because we didn't want to write complex regular expressions to fully support it.
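Once messages are rewritten, it can be useful to check that every subject line has the expected shape. The sketch below is an assumption of ours, not part of the original migration: the function names and the regular expression are hypothetical, and the pattern only covers the simplified form type(scope): description rather than the full Conventional Commits grammar.

```shell
#!/bin/bash
# Simplified pattern: a lowercase type, an optional (scope), an optional "!",
# then ": " followed by a description. This is NOT the full specification.
CONVENTIONAL_RE='^[a-z]+(\([a-zA-Z0-9._-]+\))?!?: .+'

# Succeeds when the given subject line matches the simplified pattern.
is_conventional() {
  printf '%s\n' "$1" | grep -Eq "$CONVENTIONAL_RE"
}

# Reads subject lines on stdin, prints the ones that do not match,
# and returns a non-zero status if at least one was found.
report_unprefixed() {
  status=0
  while IFS= read -r subject; do
    if ! is_conventional "$subject"; then
      echo "not conventional: $subject"
      status=1
    fi
  done
  return $status
}
```

Inside a repository, the check can be fed with the subject lines of the log, for example with git log --format=%s | report_unprefixed.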
There are 2 tools to rewrite commit messages: git filter-branch and git filter-repo.
Following the Git recommendation, we chose git filter-repo. After installing the tool, the command to rewrite the commit messages of the current repository is:
git filter-repo --message-callback 'return b"chore(<package-name>): " + message'
To see more usage examples of rewriting repository history with git filter-repo, you can refer to its documentation.
Transferring GitHub points
After migrating the repositories and publishing the new monorepo to GitHub, we want to transfer the existing GitHub issues from the old repositories. Issues can be transferred from one repository to another using the GitHub interface. The GitHub documentation describes the procedure.
Unfortunately, at the time of this writing, there is no way to transfer issues in bulk. Issues must be transferred one by one. But this can give you an excuse to "forget" to transfer annoying pending issues created by the project community ;)
What about GitHub pull requests? They will be lost, and we have to live with it. The good news is that links between issues written in comments and linked pull requests are kept thanks to redirects.
Migration script
The migration bash script leverages the approaches and tools chosen above. It generates the ./node-csv directory containing the Node CSV project files reorganized as a Lerna monorepo.
#!/bin/bash
set -e
# 1. Configure
REPOS=(
  https://github.com/adaltas/node-csv
  https://github.com/adaltas/node-csv-generate
  https://github.com/adaltas/node-csv-parse
  https://github.com/adaltas/node-csv-stringify
  https://github.com/adaltas/node-stream-transform
)
OUTPUT_DIR=node-csv
PACKAGES_DIR=packages
# Fall back to /tmp when TMPDIR is not set (common on Linux)
TMPDIR=${TMPDIR:-/tmp}
# 2. Initialize a new repository
rm -rf $OUTPUT_DIR && mkdir $OUTPUT_DIR && cd $OUTPUT_DIR
git init .
git remote add origin ${REPOS[0]}
# 3. Migrate repositories
for repo in ${REPOS[@]}; do
  # 3.1. Get package name: last URL segment without the "node-" prefix
  splited=(${repo//\// })
  package=${splited[${#splited[@]}-1]/node-/}
  # 3.2. Rewrite commit messages via a temporary repository
  rm -rf $TMPDIR/$package && mkdir $TMPDIR/$package && git clone $repo $TMPDIR/$package
  git filter-repo \
    --source $TMPDIR/$package \
    --target $TMPDIR/$package \
    --message-callback "return b'chore($package): ' + message"
  # 3.3. Merge the repository into the monorepo
  git remote add -f $package $TMPDIR/$package
  git merge --allow-unrelated-histories $package/master -m "chore($package): merge branch 'master' of $repo"
  # 3.4. Move repository files to the packages folder
  mkdir -p $PACKAGES_DIR/$package
  files=$(find . -maxdepth 1 | grep -Ev '^\./\.git$' | grep -Ev '^\.$' | grep -Ev "^\./${PACKAGES_DIR}$")
  for file in $files; do
    mv $file $PACKAGES_DIR/$package
  done
  git add .
  git commit -m "chore($package): move all package files to $PACKAGES_DIR/$package"
  # 3.5. Create a new branch
  git branch init/$package $package/master
done
# 4. Cleanup and remove outdated files
rm $PACKAGES_DIR/**/CONTRIBUTING.md
rm $PACKAGES_DIR/**/CODE_OF_CONDUCT.md
rm -rf $PACKAGES_DIR/**/.github
git add .
git commit -m "chore: remove outdated packages files"
To run this script, simply create a file, for example with the name migrate.sh, paste the script's content inside it, then make it executable and run it with the commands:
chmod u+x ./migrate.sh
./migrate.sh
Note! Don't forget to install git-filter-repo before running the script.
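A small guard can make this requirement explicit by checking for the tools before anything is deleted or cloned. This is a sketch under assumptions: the require_command helper is hypothetical, and it assumes git-filter-repo is installed as an executable on the PATH (some installations place it in the Git exec path instead, where this check would not find it).

```shell
#!/bin/bash
# Succeeds if the given command can be found, otherwise prints a
# message on stderr and returns a non-zero status.
require_command() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "missing required command: $1" >&2
    return 1
  fi
}
```

Placed at the top of migrate.sh, after set -e, the lines require_command git and require_command git-filter-repo stop the script early with a readable message.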
Notes for each step of the script:
1. Configure
Configuration variables define the list of repositories to be migrated, the destination directory of the new Lerna monorepo, and the folder for packages inside it. You can modify these variables to reuse this script for your project.
2. Initialize a new repository
We initialize a new repository. The first repository of the list is also registered as the remote origin repository.
3. Migrate repositories
3.1. Get package name
It extracts the package names from the repository links. In our case, the repositories are prefixed with node-, which we don't want to keep.
3.2. Rewrite commit messages via a temporary repository
To add a prefix to the commits of each package using the pattern chore(<package-name>): , we need to rewrite each repository separately. This is possible via a repository locally cloned to a temporary folder.
3.3. Merge the repository into monorepo
First, we add the locally cloned repository as a remote of the monorepo. Then, we merge its commit history, specifying a merge commit message.
3.4. Move repository files to the packages folder
After merging, the files of the merged repository appear under the monorepo root directory. Following the structure we want to achieve, we move these files to the packages directory and commit the change.
3.5. Create a new branch
The commit history is now linked to our monorepo through a remote repository. The history will be lost if the original repository is erased. To store the history in the monorepo, we create a branch which tracks the remote repository and prefix it with init/.
4. Cleanup and remove outdated files
For the sake of illustration, we clean up some package files that are outdated because of the migration. Some of these files will be moved to the repository root directory.
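Step 3.1 is the least readable part of the script. As an illustration, the same extraction can be sketched with two parameter expansions instead of an intermediate array. The package_name function is a hypothetical helper, not part of the original script, and it assumes the URL has no trailing slash or .git suffix.

```shell
#!/bin/bash
# Prints the package name for a repository URL: the last path segment,
# without the leading "node-" prefix if there is one.
package_name() {
  name=${1##*/}              # drop everything up to the last "/"
  printf '%s\n' "${name#node-}"  # drop a leading "node-"
}

package_name https://github.com/adaltas/node-csv
package_name https://github.com/adaltas/node-stream-transform
```

Unlike the substitution used in the script, the prefix removal here only applies at the start of the name, which avoids surprises with a repository that contains node- elsewhere in its name.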
Additional steps
The Git repository is now ready and, as such, qualifies as a monorepo. To make it usable, additional files must be created, such as a root package.json file, the lerna.json configuration file if using Lerna, and a README file. Refer to the first article of our series to apply the necessary changes and initialize your monorepo with Lerna.
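As a rough idea of what these files can look like, the sketch below writes a minimal root package.json and lerna.json. All values are assumptions for illustration: the package name is hypothetical, the workspaces field assumes a package manager with workspace support, and independent versioning is only one of the modes Lerna offers. The files are written to a temporary directory so that nothing is overwritten.

```shell
#!/bin/bash
set -e

# Hypothetical location: in practice this is the monorepo root.
MONOREPO_DIR=$(mktemp -d)

# Root package.json: private, so the root itself is never published.
cat > "$MONOREPO_DIR/package.json" <<'EOF'
{
  "name": "node-csv-monorepo",
  "private": true,
  "workspaces": ["packages/*"]
}
EOF

# lerna.json: where the packages live and how they are versioned.
cat > "$MONOREPO_DIR/lerna.json" <<'EOF'
{
  "packages": ["packages/*"],
  "version": "independent"
}
EOF

echo "files written to $MONOREPO_DIR"
```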
Conclusion
Migrating existing open-source projects requires you to be tidy and meticulous because a small mistake can break the work of your users. All the steps must be carefully analyzed and well tested. In this article, we covered the scope of work needed to migrate several Node.js projects to a Lerna monorepo. We considered different approaches, techniques, and available tools to automate the migration, using our Node CSV open-source project as an example.