Using git with CData Arc


This article is a guide to setting up git version control with CData Arc. It assumes the reader is familiar with git and understands how to use it to version control a project.

This article covers initialization, the resources that should and should not be under version control, and tracking and committing changes.

For more information on git architecture, commands and guides, please visit the official git documentation.

Initializing the git Repository

Since Arc is not under version control out of the box, the installation directory of Arc will need to be initialized as a git repository.

In this example, the Arc installation is located at C:\arc on the machine. To initialize that directory as a git repository, run the following command inside that directory:

git init

This creates a new subdirectory named .git that contains all of the necessary repository files: a git repository skeleton.
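
The initialization step can be sketched as follows. ARC_DIR is a hypothetical variable introduced only for this illustration: set it to the Arc installation directory, or leave it unset to try the commands in a throwaway directory without touching a real installation.

```shell
# ARC_DIR is an assumption for this sketch, not an Arc setting. If it is
# unset, a temporary directory stands in for the Arc installation.
ARC_DIR="${ARC_DIR:-$(mktemp -d)}"
cd "$ARC_DIR"

# Create the repository skeleton (.git) in the current directory.
git init --quiet

# Ask git whether the current directory is now inside a working tree.
git rev-parse --is-inside-work-tree
```

If the last command reports true, the directory is ready for the .gitignore file described next.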

Now existing files, resources and directories can be placed under version control. We recommend beginning by creating a .gitignore file, tracking the required files and making an initial commit. This can be accomplished with a few git add commands that specify the files to track, followed by a git commit; both are discussed in a later section.

First, the most important step is to create a comprehensive .gitignore file for the newly initialized git repository.

The .gitignore File

Creating the .gitignore file is one of the most crucial steps for using git effectively with Arc, because the application directory contains a lot of data and resources that should not be checked into version control. The goal is to track only the necessary data and resources and to ignore everything else.

For Arc, there are two categories of resources to include in your .gitignore file: application directories/resources and connector directories/resources.

Resources to Ignore

The following list is a comprehensive recommendation of resources within Arc that we suggest be ignored by version control.

  • Application Directories - These directories contain information that is either maintained by the application database or information that isn’t essential to version control
    • api
    • locks
    • lib
    • connectors
    • db
    • logs
    • reports
  • Connector Directories and Resources - These directories contain unsent files, received files, copies of sent files and log messages from all connectors within the "workspaces" and "data" directories.
    • Send
    • Receive**
    • Sent
    • Archive
    • Logs
    • Resources/Samples/Backups

**Certain connectors (Branch, Copy, EDI, etc.) can have an additional Receive folder for certain use cases. For example, Branch Connectors can have a ReceiveElse folder if there is nothing connected to the "else" path represented by the gray arrow. To ignore these folders as well as the standard Receive folder, use a wildcard character after Receive to match all Receive-type folders (i.e. Receive*).
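
As a quick check of the wildcard behavior, the sketch below uses git check-ignore in a throwaway repository. The workspace name Default, the connector name Branch1 and the file name in.xml are made up for illustration.

```shell
# Throwaway repository that stands in for the Arc installation directory.
cd "$(mktemp -d)" && git init --quiet

# Hypothetical connector layout with a standard and an "else" Receive folder.
mkdir -p workspaces/Default/Branch1/Receive workspaces/Default/Branch1/ReceiveElse
touch workspaces/Default/Branch1/Receive/in.xml \
      workspaces/Default/Branch1/ReceiveElse/in.xml \
      workspaces/Default/Branch1/port.cfg

# A single wildcard rule covering every Receive-type folder.
echo 'workspaces/*/*/Receive*' > .gitignore

# check-ignore echoes back only the paths that match an ignore rule.
git check-ignore \
  workspaces/Default/Branch1/Receive/in.xml \
  workspaces/Default/Branch1/ReceiveElse/in.xml \
  workspaces/Default/Branch1/port.cfg
```

Only the two paths under the Receive and ReceiveElse folders should be echoed back; port.cfg matches no rule and therefore stays eligible for tracking.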

Example .gitignore File

The following .gitignore file covers the resources discussed above and is what we recommend using as a baseline:

#exclude the following dirs and resources in the application dir
api/
locks/
lib/
connectors/
db/
logs/
#version control the report.cfg file but ignore the .csv result files
reports/*/*.csv

#exclude the following dirs and resources from the workspaces dir
workspaces/*/*/Send
#exclude all iterations of Receive folders. 
#if you use additional paths for SFTP/FTP Server, be sure to include them here
workspaces/*/*/Receive*
workspaces/*/*/Sent
workspaces/*/*/Archive
workspaces/*/*/Logs
workspaces/*/Resources/Samples
#exclude backup .bak files that are generated by XML Map Connectors
workspaces/*/Resources/*.bak*

#exclude the following dirs and resources from the data dir
data/*/Send
data/*/Receive*
data/*/Sent
data/*/Archive
data/*/Logs
data/*/Resources/*.bak*
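
Before making the initial commit, it can help to preview what the ignore rules leave behind. The sketch below builds a made-up miniature of an Arc directory layout (MyReport, Script1 and the file names are hypothetical), applies a subset of the rules above and asks git for a dry run.

```shell
# Throwaway repository with a made-up miniature of an Arc directory layout.
cd "$(mktemp -d)" && git init --quiet
mkdir -p logs reports/MyReport data/Script1/Logs
touch logs/app.log \
      reports/MyReport/report.cfg reports/MyReport/result.csv \
      data/Script1/port.cfg data/Script1/Logs/run.log

# A subset of the baseline rules from the example file.
cat > .gitignore <<'EOF'
logs/
reports/*/*.csv
data/*/Logs
EOF

# Dry run: list what "git add -A" would stage, without staging anything.
git add -A --dry-run
```

The dry run should list the .gitignore file, report.cfg and port.cfg, while the .csv and .log files are left out. Running the same command in a real installation is a low-risk way to spot paths that still need an ignore rule.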

Tracking Changes

If Arc is version controlled using the recommendations above, all connector settings, flows, workspaces, profile settings, application settings and users will be tracked by git version control. Now comes the time to stage and commit those files.

The examples shown below use git within Visual Studio Code.

NOTE: This has been done using one "master" branch.

Modified Changes

Any time a change is made to one of the tracked resources described above, the status of that file in git changes to "modified". For example, if a user changes a connector setting, this is represented by a capital "M" on the port.cfg file of the connector that was modified:

Git view of Arc

This can also be seen by issuing a "git status" command which will show the state of the working directory and the staging area:

Using the git status command
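
The same state can be reproduced on the command line. The sketch below is an illustration in a throwaway repository: the connector name Script1 and the file contents are placeholders (not Arc's real port.cfg format), and the -c options only supply a commit identity for the sandbox.

```shell
# Throwaway repository with one committed connector file (made-up names).
cd "$(mktemp -d)" && git init --quiet
mkdir -p data/Script1
echo 'setting = old' > data/Script1/port.cfg
git add -A
git -c user.name='Arc Demo' -c user.email='demo@example.com' \
    commit --quiet -m "baseline"

# Simulate a connector setting being changed through Arc.
echo 'setting = new' > data/Script1/port.cfg

# Short status: " M" marks a tracked file with unstaged modifications.
git status --short

# Show the line-level change behind that "M".
git diff -- data/Script1/port.cfg
```

The short status format is the command-line counterpart of the "M" badge shown in Visual Studio Code, and git diff shows exactly which lines of the file changed.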

Untracked Changes

If a new connector or workspace is created or a resource like a certificate is added, these items will be marked as "untracked" until they are added and committed to the repository. This is represented by a capital "U" next to the new resource:

Untracked changes

In this case, adding a new workspace creates an untracked "MyNewWorkspace" directory and flow.json file.

This can also be seen by issuing a "git status" command:

Using the git status command
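
On the command line, untracked items appear as "??" rather than "U". The sketch below assumes, for illustration only, that the new workspace lives at workspaces/MyNewWorkspace and contains a flow.json file.

```shell
# Throwaway repository containing only a newly created workspace.
cd "$(mktemp -d)" && git init --quiet
mkdir -p workspaces/MyNewWorkspace
echo '{}' > workspaces/MyNewWorkspace/flow.json

# Default view: untracked content is collapsed into its top-most
# untracked directory and marked with "??".
git status --short

# List every untracked file individually instead.
git status --short --untracked-files=all
```

The second form is useful when a new workspace or connector adds several files and each one should be reviewed before staging.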

Staging Changes

Once the working session within Arc is finished, there will likely be a buildup of both modified and untracked changes. These will show up when a "git status" command is issued in a terminal or PowerShell.

It is good practice to review all of these changes, either in a code editor that integrates with git version control (for example, the "Source Control" tab in Visual Studio Code) or in a terminal or PowerShell.

Once the changes have been confirmed as correct, they must be added to the staging area before they can be committed. To stage all changes from all tracked and untracked files, issue the "git add -A" command:

git add -A

After adding changes, it is a good idea to run "git status" again to confirm that all changes have been staged and that nothing is left modified or untracked.
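
A scripted version of that double check might look like the sketch below. It runs in a throwaway repository; the file names and contents are placeholders, and the -c options only supply a commit identity for the sandbox.

```shell
# Throwaway repository with a committed baseline (made-up names).
cd "$(mktemp -d)" && git init --quiet
mkdir -p data/Script1
echo 'v1' > data/Script1/port.cfg
git add -A
git -c user.name='Arc Demo' -c user.email='demo@example.com' \
    commit --quiet -m "baseline"

# One modified file and one new, untracked file.
echo 'v2' > data/Script1/port.cfg
echo 'placeholder' > data/Script1/new.cer

# Stage everything that is not ignored.
git add -A

# Paths now in the staging area.
git diff --cached --name-only

# Tracked files with unstaged modifications (no output expected).
git diff --name-only

# Untracked files that are not ignored (no output expected).
git ls-files --others --exclude-standard
```

If the last two commands print nothing, every change from the session is staged and ready to commit.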

Committing Changes

After verifying that the working session's changes have been staged, commit them to the master branch of the git repository. This updates the repository to reflect the current state of the Arc instance and allows you to roll back to this point in the future, should the need arise.

To commit changes, run the commit command and enter a commit message that briefly summarizes the changes being committed. For example, if a user adds a Script Connector that parses some data out of an XML file, the commit might look something like this:

git commit -m "adding script connector responsible for parsing token out of input XML"
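
To confirm that a commit was recorded, the history and working tree can be checked afterwards. The sketch below runs in a throwaway repository with a placeholder connector file; the -c options are an assumption made so the sandbox works without a configured identity, and a real repository would normally rely on the configured user.name and user.email.

```shell
# Throwaway repository with one staged connector file (made-up names).
cd "$(mktemp -d)" && git init --quiet
mkdir -p data/Script1
echo 'placeholder' > data/Script1/port.cfg
git add -A

# Commit the staged changes with a short, descriptive message.
git -c user.name='Arc Demo' -c user.email='demo@example.com' \
    commit --quiet -m "adding script connector responsible for parsing token out of input XML"

# One-line view of the latest commit: abbreviated hash plus message.
git log --oneline -1

# A clean working tree prints nothing here.
git status --short
```

The abbreviated hash printed by git log identifies this point in the history and is what a later rollback would refer to.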

