Hosting a Git Repository in Windows

On February 20, 2012, in Uncategorized, by derekgreer

If you’re working in a Windows-only environment and you’d like to host a git repository, this article will walk you through three different approaches: Shared File System Hosting, Git Protocol Hosting, and SSH Hosting.

Setting Up Your Repository

Before choosing which hosting strategy you’d like to use, you’ll first need a git repository to share.  The following steps will walk you through setting up an empty git repository.

Step 1 – Install Git

The most popular way to install git on Windows is by installing msysGit.  The msysGit distribution of git includes a minimalist set of GNU utilities along with the git binaries.  Download the msysGit installer and execute it.  When prompted to select components on the forth dialog, enable the Windows Explorer Integration components: ‘Git Bash Here’ and ‘Git GUI Here’.  When prompted to configure the line-ending conversions, select ‘checkout as-is, commit as-is’ if your team will only ever be working on projects from a Windows machine.

For reference, here are the dialogs presented by the wizard:

msysgit-install-1-3_thumb2

msysgit-install-4-6_thumb2

msysgit-install-7-9_thumb3

 

Step 2 – Create a Bare Repository

With this step, we’ll be creating an empty repository for a sample project named ‘sample’.  Since we’ll only be using this repository remotely, we’ll be initializing a folder as a ‘bare’ repository, which means that git will place all the files git needs to manage the repository directly within the initialized folder rather than in a ‘.git’ sub-folder.  To create the repository, follow these steps:

  • Using Windows Explorer, create a folder somewhere on the server where you plan to store all of your repositories (e.g. C:\git\).
  • Within the chosen repositories parent folder, create a sub-folder named ‘example.git’.
  • From the folder pane of Windows Explorer, right click on the ‘example’ folder and select ‘Git Bash Here’.  This will open up a bash shell.
  • At the command prompt, issue the following command:
$> git init --bare

Note

The ‘example’ repository can be created using your own credentials, but if you’d like to use the ‘Single User’ strategy presented in the SSH Hosting section, I’d recommend creating a ‘git’ Windows account and creating the repository with that user.

 

You now have a git repository for the ‘example’ project and are ready to select a strategy for sharing this repository with your team.

Shared File System Hosting

Hosting a git repository via a shared file system is by far the easiest way to get started sharing a git repository.  Simply share the main git folder where you created the ‘example.git’ repository and grant full control of the contents.  Once you’ve set up the desired level of access, users can clone the shared repository using the following command:

$> git clone \\[repository server name]\[share name]\example.git

Git Protocol Hosting

Git supplies its own ‘git’ protocol for simple hosting purposes.  To host an existing repository using the git protocol, you can simply issue the following command from the hosting server:

$> git daemon --base-path=C:/git --export-all

This will host all repositories located in the folder C:/git as read-only.  To allow push access, add –enable=receive-pack:

$> git daemon --base-path=C:/git --export-all --enable=receive-pack

This is fine for situations where you want to temporarily share a repository with a co-worker (e.g. when working on a feature branch together), but if you want to permanently share remote repositories then you’ll need to start the git daemon as a Windows service.

Unfortunately, it doesn’t seem Microsoft makes installing an arbitrary command as a service very easy.  Windows does provide the Service Control utility (e.g. sc.exe), but this only allows you to work with Windows service applications.  To resolve this problem, a simple .Net Windows Service application can be created which performs a Process.Start() on a supplied service argument. 

For the purposes of this guide, I’ve created just such a service which you can obtain from https://github.com/derekgreer/serviceRunner.  Once you have the service compiled, create a batch file named ‘gitd.bat’ with the following contents:

"C:\Program Files (x86)\Git\bin\git.exe" daemon --reuseaddr --base-path=C:\git --export-all --verbose --enable=receive-pack

You can then register the gitd.bat file as a service using the Service Control utility by issuing the following command (from an elevated prompt):

sc.exe create "Git Daemon" binpath= "C:\Utils\ServiceRunner.exe C:\Utils\git-daemon.bat" start= auto

To start the service, issue the following command:

sc.exe start "Git Daemon"

After successfully starting the Git Daemon service, you should be able to issue the following command:

git clone git://localhost/example.git

Registering Git Daemon With Cygwin

As an alternative to using the Service Control utility, you can also install any application as a service using Cygwin’s cygrunsrv command.  If you already have Cygwin installed or you’re planning on using Cygwin to host git via SSH then this is the option I’d recommend.  After installing Cygwin with the cygrunsrv package, follow these steps to register the git daemon command as a service:

Step 1: Open a bash shell

Step 2: In a directory where you want to store your daemon scripts (e.g. /cygdrive/c/Cygwin/usr/bin/), create a file named “gitd” with the following content:

#!/bin/bash

c:/Program \Files/Git/git daemon --reuseaddr                 \
                                 --base-path=/cygdrive/c/git \
                                 --export-all                \
                                 --verbose                   \
                                 --enable=receive-pack

Step 3: Run the following cygrunsrv command to install the script as a service (Note: assumes Cygwin is installed at C:\Cygwin):

cygrunsrv   --install gitd                          \
            --path c:/cygwin/bin/bash.exe           \
            --args c:/cygwin/usr/bin/gitd           \
            --desc "Git Daemon"                     \
            --neverexits                            \
            --shutdown

Step 4: Run the following command to start the service:

cygrunsrv --start gitd

SSH Hosting

Our final approach to be discussed is the hosting of git repositories via SSH.  SSH (which stands for Secure Shell) is a protocol commonly used by Unix-like systems for transporting information securely between networked systems.  To host git repositories via SSH, you’ll need to run an SSH server process on the machine hosting your git repositories.  To start, install Cygwin with the openssh package selected. 

Once you have Cygwin installed, follow these steps:

Step 1 – Open a bash shell

Step 2 – Run the following command to configure ssh:

ssh-host-config -y

For later versions of Windows, this script may require that a privileged user be created in order to run with elevated privileges.  By default, this user will be named cyg_server.  When prompted, enter a password for the cyg_server user which meets the password policies for your server.

Step 3 – Run the following command to start the sshd service:

net start sshd

You should see the following output:

The CYGWIN sshd service is starting.
The CYGWIN sshd service was started successfully.

The sshd service should now be up and running on your server.  The next step is to choose a strategy for allowing users to connect via SSH.

User Management Strategy

There are two primary strategies for managing how users connect to the SSH server.  The first is to configure each user individually.  The second is to use a single user for your entire team.  Let’s look at each.

Individual User Management

SSH allows users configured in the /etc/passwd file to connect via SSH.  Cygwin automatically creates this file when it first installs.  To add additional users (e.g. the user ‘bob’), you can append new records to the passwd file by issuing the following command from a bash shell:

mkpasswd -l | grep ‘^bob’ >> /etc/passwd

This command adds an entry for a user ‘bob’ by doing the following:

  • run the mkpasswd command for all local users
  • select only entries starting with the string ‘bob’
  • append the entry to the /etc/passwd file

To see the new entry, you can ‘cat’ the contents of the file to the screen by issuing the following command:

cat /etc/passwd

This should show entries that look like the following:

SYSTEM:*:18:544:,S-1-5-18::
LocalService:*:19:544:U-NT AUTHORITY\LocalService,S-1-5-19::
NetworkService:*:20:544:U-NT AUTHORITY\NetworkService,S-1-5-20::
Administrators:*:544:544:,S-1-5-32-544::
Administrator:unused:500:513:DevMachine\Administrator,S-1-5-21-2747539007-3005349326-118100678-500:/home/Administrator:/bin/bash
Guest:unused:501:513:DevMachine\Guest,S-1-5-21-2747539007-3005349326-118100678-501:/home/Guest:/bin/bash
bob:unused:1026:513:git,DevMachine\bob,S-1-5-21-2747539007-3005349326-118100678-1026:/home/git:/bin/bash

By adding the new entry to the /etc/password file, the user ‘bob’ will now have access to access git repositories via SSH.  To clone a repository, Bob would simply need to issue the following command:

git clone ssh://DevMachine/git/example.git

At this point, Bob would have to enter his password as it is set on the server.  I’ll show you how to avoid typing the password a little later.

One downside of this setup, however, is that the user ‘bob’ will also have access to shell into the machine by entering the following command:

ssh DevMachine

If the user bob already has an account on the machine, this probably isn’t an issue.  However, not all users in the /etc/password file are necessarily accounts on the local server.  You can also add network users by issuing the following command:

mkpasswd -d [domain name] -u [user name] >> /etc/passwd

In this case, you may want to restrict what the user ‘bob’ can do to just issuing git commands.  You can do this by changing bob’s default shell entry in the /etc/passwd file from /bin/bash to /usr/bin/git-shell.  The git-shell is a special shell which restricts access to just a few git commands.  Trying to ssh into a server where the shell is set to git-shell will print an error message.

Single User

Another strategy that is a bit easier to manage is to use a single user for all team members.  At first, this might seem like you’re just throwing up your hands and opening up access to everyone, but this isn’t actually the case as we’ll see shortly.

To use this strategy, follow these steps:

Step 1 – Global User Creation

First create a new user to be used for git access.  I’ll call the user ‘git’ for our example. 

Step 2 – Configure SSH Access

Now, configure the ‘git’ user to have access to shell into the server:

mkpasswd -l | grep ‘^git’ >> /etc/passwd

Step 3 – Change Shell

Optional, but it’s a good idea to set the git user’s shell to the git-shell.

You should now be able to use the git user to access any git repositories on the server.  If you test that out at this point, you’ll be prompted for the ‘git’ user’s password.

What we want to do at this point is configure users to have access to ssh as the ‘git’ account without having to use the password.  To do this, we’ll need to set up an ssh public/private key pair for each user.  Let’s use bob as our example.  I’ll be using Cygwin’s openssh as our example, but the concepts are the same if using Putty as your SSH client.

To setup bob’s ssh keys, he’ll need to run the following command:

ssh-keygen

This will prompt the user bob for the location to store the ssh keys.  Hitting enter will accept the defaults. The command will further prompt him for a passphrase.  A passphrase is recommended because it keeps anyone who has access to bob’s machine from getting his private key and impersonating him, but for now let’s just have him hit enter when prompted.  Entering an empty passphrase will cause ssh to bypass asking for anything.  I’ll talk about ways of using a passphrase without having to enter it every time in a bit.

Here’s the kind of output you’ll see:

Generating public/private rsa key pair.
Enter file in which to save the key (/home/bob/.ssh/id_rsa):
Created directory '/home/bob/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/bob/.ssh/id_rsa.
Your public key has been saved in /home/bob/.ssh/id_rsa.pub.
The key fingerprint is:
53:6c:19:ec:c7:96:c5:ee:1d:b0:93:84:b6:8c:a5:2d bob@bobsMachine
The key's randomart image is:
+--[ RSA 2048]----+
|         .. ..   |
|         ..* oo  |
|         .%.o++  |
|         E.+=+.. |
|        S .o ....|
|         .    . .|
|                 |
|                 |
|                 |
+-----------------+

From here, Bob will have a newly created .ssh directory in his home directory:

.ssh $> ls

id_rsa  id_rsa.pub

What we as the admin of the git server need from Bob is the contents of his id_rsa.pub file.  That should look something like the following:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDRVsZqNvQNc9YBcLCrm2pZotUiFsyZsQtdhWdtVF3u3PHR1ZNcGvWqSSI+hrb7HP/mTFBzyciO1nWRfbERXjexLd5uBf+ou5ZHDs51JIGQs61Lb+Kq/Q8P2/77bqGIIF5cZPfewZM/wQYHiR/JhIWHCRRmVOwPgPkfI7cqKOpbFRqyRYuV0pglsQEYrjm4FCM2MJ4iWnLKdgqj6vCJbNT6ydx4LqqNH9fCcbOphueoETgiBeUQ9U64OsEhlek9trKAQ0pBSNkJzbslbqzLgcJIitX4OYTxau3l74W/kamWeLe5+6M2CUUO826R9j4XuGQ2qqo5A5GrdVSZffuqRtX1 bob@bobMachine

Next, we want to add this key to a special file named authorized_users in the remote ‘git’ user’s .ssh folder.

[DevMachine]: .ssh > $ cat authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC36qnox4nlTInc1fyOlaUC3hJhEdVM4/qKeEKPBJ520sOzJG+cRvRGNSdbtLNKD9xZs0dpiql9Vtgy9Yc2XI+lWjBGUmPbqWUuP8IZdFGx3QwPSIx9YzakuUBqYE5+9JKcBuHIhIlilqCCzDtXop6Bi1lN0ffV5r6PyqyFIv0L7MJb8jDsHX7GRl4IGu8ScxfY4G0PS3ZrMGfQBr2fm8KFzg7XWVaP/HTT4XKcf5Jp6oHvLz8FvEfdZdajyFUXRzrE0Kt9KAbeIBJV8+usiTAVpsmMY1yfrsuBUdOlhpvL/pU2o5B6K8VlJeXSF4IYEgS+v6JBAlyaWQkXupQr+lIL bob@BobMachine

That’s it.  The user ‘bob’ will now be able to ssh without providing a password.

SSH Without Passphrases

Remember, it’s recommended that everyone use a passphrase.  This prevents someone who’s gained access to your machine from tricking out the remote server to letting them have access as you.  To use a passphrase, repeat the above steps with a a passphrase set.  To keep us from having to always type in the passphrase, we can use ssh-agent (or the Pageant process for Putty users).  The basic premise is that you type your passphrase once per session and the ssh-agent will keep track of the passphrase for you. 

To use the ssh-agent to track your passphrase, add the following to your ~/.bash_profile:

SSHAGENT=/usr/bin/ssh-agent
SSHAGENTARGS="-s"
if [ -z "$SSH_AUTH_SOCK" -a -x "$SSHAGENT" ]; then
	eval `$SSHAGENT $SSHAGENTARGS`
	trap "kill $SSH_AGENT_PID" 0
fi

Close your bash shell and reopen it.  You should see an agent pid output when you open your shell again.

Agent pid 1480
[DevMachine]: >

Now, issue the following command:

ssh-add ~/.ssh/id_rsa

Type your passphrase when prompted.  Thereafter, you will be able to ssh without providing a passphrase.

That concludes the guide.  Enjoy!




						
Tagged with:  

JavaScript Closures Explained

On February 17, 2012, in Uncategorized, by derekgreer

If you write any code in JavaScript then you’ve probably used closures, but do you actually understand what they are and how they work?  Taking the time to understand closures and how they’re implemented can add a deeper dimension to your understanding of the JavaScript language.  In this article, I’ll discuss what closures are and how they’re specified for the JavaScript language.

What Are Closures?

A closure is a pairing of a function along with its referencing environment such that identifiers within the function may refer to variables declared within the referencing environment.

Let’s consider the following example:

var createGreeting = function(greeting) {
    return function(name) {
        document.write(greeting + ', ' + name + '.');
    };
};


helloGreeting = createGreeting("Hello");
howdyGreeting = createGreeting("Howdy");

helloGreeting("John");  // Hello, John.
helloGreeting("Sally"); // Hello, Sally.
howdyGreeting("John");  // Howdy, John.
howdyGreeting("Sally"); // Howdy, Sally.

In this code, a function named createGreeting is defined which returns an anonymous function.  When the anonymous function is executed, it prints a greeting which consists of the outer function’s greeting parameter along with the inner function’s name parameter.  While the greeting parameter is neither a formal parameter  nor a local variable of the inner function (which is to say it is a free variable), it is still resolved when the function executes.  Moreover, the createGreeting function object is no longer in scope and may have even been garbage collected at the point the function executes.  How then does this work?

The inner function is capable of resolving the greeting identifier due to a closure which has been formed for the inner function.  That is to say, the inner function has been paired with its referencing (as opposed to its calling) environment.

Conceptually, we can think of our closure as the marriage between a function and the environment in which it was declared within:

closure2

 

Exactly what this marriage looks like at an implementation level differs depending on the language.  In C# for instance, closures are implemented by the instantiation of a compiler-generated class which encapsulates a delegate and the variables referenced by the delegate from its declaring scope. In JavaScript, the function object itself contains a non-accessible property pointing to an object containing the variables from its declaring scope.  While each implementation differs, both render a function that has access to what its environment looked like at the point it was created.

Let’s move on to examining how JavaScript closures work from a specification perspective.

JavaScript Closures

To understand how closures work in JavaScript, it helps to have a grasp of the underlying concepts set forth within the ECMAScript Language Specification.  I’ll be using Edition 5.1 (ECMA-262) of the specification as the basis for the following discussion.

Execution Contexts

When a JavaScript function is executed, a construct referred to as an Execution Context is created.  The Execution Context is an abstract concept prescribed by the specification to track the execution progress of its associated code.  As an application runs, an initial Global Execution Context is created.  As each new function is created, new Execution Contexts are created which form an Execution Context stack.

There are three primary components prescribed for the Execution Context:

    1. The LexicalEnvironment

    2. The VariableEnvironment

    3. The ThisBinding

Only the LexicalEnvironment and VariableEnvironment components are relevant to the topic of closures, so I’ll exclude discussion of the ThisBinding.  (For information about how the ThisBinding is used, see sections 10.4.1.1, 10.4.2, and 10.4.3 of the ECMAScript Lanauage Specification.)

LexicalEnvironment

The LexicalEnvironment is used to resolve identifier references made by code associated with the execution context.  Conceptually, we can think of the LexicalEnvironment as an object containing the variables and formal parameters declared within the code associated by the current Execution Context.

A LexicalEnvironement itself is comprised of two components: An Environment Record, which is used to store identifier bindings for the Execution Context, and an outer reference which points to a LexicalEnvironment of the declaring Execution Context (which is null in the case of the Global Execution Context’s LexicalEnvironment outer reference).  This forms a chain of LexicalEnvironments, each maintaining a reference to the outer scope’s environment:

 

LexicalEnvironmentChain

 

VariableEnvironment

The VariableEnvironment is used to record the bindings created by  variables and function declarations within an execution context.  Upon reading that description, you may be thinking: “but I thought the LexicalEnvironment held the bindings for the current execution context”.  Technically, the VariableEnvironment contains the bindings of the variables and function declarations defined within an Execution Context and the LexicalEnvironment is used to resolve the bindings within the Execution Context.  Confused?  The answer to this seemingly incongruous approach is that, in most cases, the LexicalEnvironment and VariableEnvironment are references to the same entity.

The LexicalEnvironment and VariableEnvironment components are actually references to a LexicalEnvironment type.  When the JavaScript interpreter enters the code for a function, a new LexicalEnvironment instance is created and is assigned to both the LexicalEnvironment and VariableEnvironment references.  The variables and function declarations are then recorded to the VariableEnvironment reference.  When the LexicalEnvironment is used to resolve an identifier, any bindings created through the VariableEnvironment reference are available for resolution.

Identifier Resolution

As previously discussed, LexicalEnvironments have an outer reference which, except for the Global Execution Context’s LexicalEnvironment, points to a LexicalEnvironment record of the declaring Execution Context (i.e. a function block’s “parent” scope).  Functions contain an internal scope property (denoted as [[Scope]] by the ECMAScript specification) which is assigned a LexicalEnvironment from the declaring context.  In the case of function expressions (e.g. var doStuff = function() { …}), the [[Scope]] property is assigned to the declaring context’s LexicalEnvironment.  In the case of function declarations (e.g. function doStuff() { … }), the [[Scope]] property is assigned to the declaring context’s VariableEnvironment.  We’ll discuss the reasons for this disparity shortly, but for now let’s just focus on the fact that the function has a [[Scope]] which points to the environment in which is was created.  This is our pairing of a function with its referencing environment, which is to say, this is our closure. 

If we recall our previous conceptualization of a function paired with its environment, the JavaScript version of this conceptualization would look a little like this:

 

closure3

When resolving an identifier, the current LexicalEnvironment is passed to an abstract operation named GetIdentiferReference which checks the current LexicalEnvironment’s Environment Record for the requested identifier and if not found calls itself recursively with the LexicalEnvironment’s outer reference.  Each link of the chain is checked until the top of the chain is reached (the LexicalEnvironment of the Global Context) in which case the binding is resolved or a reference of undefined is returned.

A Distinction Without a Difference … Usually

As mentioned, the LexicalEnvironment and the VariableEnvironement references point to the same LexicalEnvironment instance in most cases.  The reason for maintaining these as separate references is to support the with statement.

The with statement facilitates block scope by using a supplied object as the Environment Record for a newly created LexicalEnvironment.  Consider the following example:

var x = {
    a: 1
};

var doSomething = function() {
    var a = 2;
    
    with(x) {
        console.log(a);
    };
};
doSomething(); // 1

In this code, while the doSomething function assigns the variable a the value of 2, the console.log function logs the value of 1 instead of 2.  What is happening here is that, for the enclosing block of code, the with statement is inserting a new LexicalEnvironment at the front of the chain with an Environment Record set to the object x.  When the code within the with statement executes, the first LexicalEnvironment to be checked will be the one created by the with statement whose Environment Record contains a binding of x with the value of 1.  Once the with statement exits, the LexicalEnvironment of the current Execution Context is restored to its previous state.

According to the ECMAScript specification, function expressions may be declared within a with statement, but not function declarations (though most implementations allow it with varied behavior).  Since function expressions may be declared, their declaration should form a closure based upon the declaring environment’s LexicalEnvironment because it is the LexicalEnvironement which might be changed by a with statement.  This is the reason why Execution Contexts are prescribed  two different LexicalEnvironment references and why function declarations and function expressions differ in which reference their [[Scope]] property is assigned upon function creation.

Conclusion

This concludes our exploration of closures in JavaScript.  While understanding the ECMAScript Language Specification in detail certainly isn’t necessary to have a working knowledge of closures, I find taking a peek under the covers now and again helps to broaden one’s understanding about a language and its concepts.

Tagged with: