Tuesday, September 21, 2010

Importing a TFS repository into a Git repository using PowerShell

Git is a snapshot-based system. It takes a snapshot of your entire tree every time you make a commit. The individual changes between commits can be derived from the state of the snapshots.
This simple model makes it easy to recreate an existing Team Foundation Server repository (or a repository from any other version control system) as a new Git repository. If you can replay the history of a Team Foundation Server repository, you can use that replay to create a matching history in a Git repository.
Enter TFS2GIT.
TFS2GIT is a PowerShell script that reads the history of a TFS repository, replays this history in a workspace, and commits each replayed step to a Git repository.
Requirements:
  1. PowerShell
  2. Access to the Team Foundation command-line tools (type ‘tf’ at a command line)
  3. Access to msysgit (type ‘git --version’ at a command line).
To import the history of the repository $/Trunk/ApplicationFoo, use:
.\TFS2GIT.ps1 $/Trunk/ApplicationFoo
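
The replay idea described above can be sketched roughly as follows. This is not the actual TFS2GIT code: the function name Get-ReplayPlan is hypothetical, and it only lists the commands a replay would run per changeset (fetch that version with 'tf get', then stage and commit the snapshot with Git) instead of executing them. The commit message format is an assumption as well.

```powershell
# Hypothetical helper, not part of TFS2GIT: lists the commands a replay
# would run for each changeset, without executing anything.
function Get-ReplayPlan {
    param(
        [int[]]$ChangeSets,
        [string]$ServerPath
    )
    # Oldest changeset first, so the Git history is built in order.
    foreach ($cs in ($ChangeSets | Sort-Object)) {
        "tf get `"$ServerPath`" /version:C$cs /recursive /noprompt"
        'git add -A'
        "git commit -m `"TFS changeset $cs`""
    }
}

# Single quotes keep PowerShell from expanding the leading '$' in the path.
Get-ReplayPlan -ChangeSets 10, 9 -ServerPath '$/Trunk/ApplicationFoo'
```

Because the changeset numbers are typed as integers, changeset 9 is planned before changeset 10; sorting them as strings would put 10 first.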

Modifying it for your own needs

The script uses a regular expression to filter all the changeset numbers out of the output of the ‘tf history’ command. Because this script originated out of my specific need to import our own TFS repositories, the regular expression will probably not work for your situation without changes.
EDIT: The script has been updated and should work for almost all situations now.
The regular expression can be found in the GetChangeSetsFromHistory function.
Our history is as follows (part of the useraccounts / comments removed to protect the innocent).



I used the regular expression \d{1,5}(?=\s{5}BC2SC) to retrieve the changeset numbers. In the future, I’ll update it to a more generic version.
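
To see what that expression does, here is a small sketch. The function name Get-ChangeSetNumber and the sample history lines are made up for illustration; only the pattern itself comes from the script.

```powershell
# Hypothetical helper: returns the changeset number found on one history
# line, or nothing when the pattern does not match.
function Get-ChangeSetNumber {
    param(
        [string]$Line,
        [string]$Pattern = '\d{1,5}(?=\s{5}BC2SC)'
    )
    # Up to five digits, but only when followed by exactly five
    # whitespace characters and the text BC2SC.
    $m = [regex]::Match($Line, $Pattern)
    if ($m.Success) { [int]$m.Value } else { $null }
}

# Made-up history lines; real 'tf history' output will look different.
Get-ChangeSetNumber '12345     BC2SC\someuser  Fixed the build'    # 12345
Get-ChangeSetNumber '12345     OTHER\someuser  Fixed the build'    # no match
Get-ChangeSetNumber '123456     BC2SC\someuser  Fixed the build'   # 23456
```

The last two calls show why the pattern is tied to one setup: a different prefix gives no match at all, and a six-digit changeset number is silently cut down to its last five digits.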
This is my first attempt at using PowerShell. Feel free to send me any enhancements through the usual channels at the excellent GitHub site.

9 comments:

  1. Allen Fienberg, 9/22/10, 5:03 AM

    Does this preserve the time stamp history in Git when it's migrated over?

  2. No, it will preserve the entire commit message (which contains the TFS timestamp), but it will have the time stamp of the moment the script ran.

  3. Thanks!
    We had to fix a few things to get it to work for us:
    - We do an iterating ReadLine to get all lines individually, because we...
    - changed the match to "\d+", which gets the decimal in the beginning.
    - Added numbers as integers (the original script was doing string-based sorting, which works for the always-5-digits history).

    A PowerShell guy could probably figure out something better, but here are our changes in GetChangesetsFromHistory:
    [...]
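    $f = [System.IO.File]::OpenText($HistoryFileName)
    $line = $f.ReadLine() # Throw away the column headers
    $line = $f.ReadLine() # Throw away the ------- separator

    while (! $f.EndOfStream)
    {
        $line = $f.ReadLine()
        # The first run of digits on the line is the changeset number
        $match = [regex]::Match($line, "\d+")
        if ($match.Success)
        {
            # Store as an integer so that sorting is numeric, not string-based
            $ChangeSets = $ChangeSets + @([System.Convert]::ToInt32($match.Value))
        }
    }
    $f.Close()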
    [...]

  4. I like your changes. Any way you could send them to me as a Git patch? That makes integrating them easier, of course ;-)

  5. [...] I also wrote a small article about its usage at http://walkingthestack.wordpress.com/2010/09/21/importing-a-tfs-repository-into-a-git-repository-usi... [...]

  6. Chris, thanks for the changes. I've incorporated them in the latest version of the script. This makes it more generic and should work out of the box for most users.

  7. Currently working on this feature....

  8. Hi There,

    when I run tfs2git I get these startup errors

    TF14061: The workspace TFS2GIT;MABLE\Pat does not exist.
    The path C:\projects is already mapped in workspace AMY.
    TF14061: The workspace TFS2GIT;MABLE\Pat does not exist.
    TF14061: The workspace TFS2GIT;MABLE\Pat does not exist.
    fatal: $home not set

    the script then iterates through 73 TFS changesets, putting them into a temp repository, and I end with these errors

    Initialized empty Git repository in C:/projects/ConvertedFromTFS/
    warning: You appear to have cloned an empty repository.
    Your converted (bare) repository can be found in the ConvertedFromTFS directory.
    Removing workspace
    TF14061: The workspace TFS2GIT;MABLE\Pat does not exist.
    Removing working directories in C:\Users\Pat\AppData\Local\Temp\workspace
    Done!

    the ConvertedFromTFS folder just contains the contents of a normal .git folder

    any ideas?
    Pat

  9. My guess would be that you are running the script in a directory that is also mapped in a workspace from TFS. Move the script to a directory that is not mapped to a TFS workspace and try again.

    The ConvertedFromTFS folder will always be a regular Git folder. It is a bare repository and does not have a working copy. Try running 'gitk' from inside the ConvertedFromTFS directory.
