Archiving Framework
Overview
The implementation consists of some Node.js scripts intended to migrate historical data from a source MSSQL Server to a target MongoDb database. The scripts would be invoked from a Jenkins job which can be triggered manually or based on a schedule.
Getting started
Reach out to the DevOps team to create a MongoDb credential and database.
Reach out to the Security team to permit outbound connection from Jenkins server to your MSSSQL server on (http://10.128.31.193:5050/).
Clone the repository on Bitbucket by running the following command:
$ git clone https://bitbucket.org/vggdev/archivingframework/src/master/
$ git checkout development
Open
config.jsonfile. This is where the configurations related to the database tables to archive are defined. A typical structure of this file is shown below:{ "sourceDatabaseName": "IdentityServer", "targetDatabaseName": "IdentityServer_Archive", "tablesDefinition": [{ "tableName": "LoginRecords", "timestampColumn": "Timestamp", "selectQuery": "SELECT * FROM [IdentityServer].[dbo].[LoginRecords] WHERE [TimeStamp] between @minDate and @maxDate", "deleteQuery": ["DECLARE @Deleted_Rows INT; SET @Deleted_Rows = 1; WHILE (@Deleted_Rows > 0) BEGIN BEGIN TRANSACTION DELETE TOP (1000) [IdentityServer].[dbo].[LoginRecords] WHERE [TimeStamp] between @minDate and @maxDate SET @Deleted_Rows = @@ROWCOUNT; COMMIT TRANSACTION CHECKPOINT END"], "minDateQuery": "SELECT min(Timestamp) as minDate FROM [IdentityServer].[dbo].[LoginRecords]", "monthsFromMinDate": 1 }, { "tableName": "Audit", "timestampColumn": "TimeStamp", "selectQuery": "select * FROM [IdentityServer].[dbo].[Audit] WHERE [TimeStamp] between @minDate and @maxDate", "deleteQuery": ["DECLARE @Deleted_Rows INT; SET @Deleted_Rows = 1; WHILE (@Deleted_Rows > 0) BEGIN BEGIN TRANSACTION DELETE TOP (1000) [IdentityServer].[dbo].[Audit] WHERE [TimeStamp] between @minDate and @maxDate SET @Deleted_Rows = @@ROWCOUNT; COMMIT TRANSACTION CHECKPOINT END"], "minDateQuery": "SELECT min(Timestamp) as minDate FROM [IdentityServer].[dbo].[Audit]", "monthsFromMinDate": 1 } ] }Property
Description
tablesDefinitionThis is where definitions related to the tables being archived are configured
tableNameThe name of the table as it'll appear on the destination database
timestampColumnIt is very important to add the SQL timestamp column on the table. This convention ensures that archiving framework splits the records into the appropriate year, i.e.
Audit_2019,Audit_2020selectQueryWrite the actual select query that extracts data from SQL. The query could be as complex as joining multiple tables into a flat structure. Whatever the result of the flattened data is, ensure that
timeStampColumnvalue equals the value configured above.deleteQueryThis is an array of queries. This gives you the flexibility of deleting records from multiple tables especially in cases where multiple tables are joined to achieve a flattened structure. Note that the delete queries are parameterized with
@minDateand@maxDate. By convention, the framework substitutes the values of theminDateQueryandmonthsFromMinDateto the parameters respectively.minDateQueryThis query should only return the minimum date on a particular table being archived. The alias of the column being returned by convention must be
minDate. This value is used to substitute the@minDateparameter as described above.monthsFromMinDateThis value is used to determine how many months from the
@minDateto archive. After getting the@minDatevalue, the framework adds the number of months specified in order to get an absolute upper and lower date boundaries.Open the
Jenkinsfile. This is where the job's pipeline related configuration are defined. The structure of the file is shown below:def archiveId = UUID.randomUUID().toString() pipeline { agent any tools { nodejs "Nodejs" } environment { ARCHIVE_ID = "${archiveId}" SQL_CONNECTION_STRING = credentials('sso_sql_prod_connection_string') MONGODB_CONNECTION_STRING = credentials('sso_mongo_prod_connection_string') } stages { stage('Setup') { steps { sh 'npm install' } } stage('Initialize') { steps { sh 'node autoArchive.js' } } } }Property
Description
SQL_CONNECTION_STRINGThe connection string is stored as a secret in Jenkins using the secret kind of text. This is in turn injected to the job as an environment variable. The value of this variable is the ID of the credential i.e
sso_sql_prod_connection_stringas shown in the snippet above.MONGODB_CONNECTION_STRINGThe destination Mongodb connection string is stored as a secret in Jenkins using the secret kind of text. This is in turn injected to the job as an environment variable. The value of this variable is the ID of the credential
Changes made to these files should be committed to a different branch.
Clone the job from the Jenkins server (http://10.128.31.193:5050/)
Goto the job configuration page
Scroll down to the repository section and update the repository URL to the one you cloned earlier.
Change branch specifier to the new branch.
Save the job
Go back to the job page and build.
Job may take a while to complete.
In case of any errors while the job is executing, simply build the job again. The framework is designed to automatically resume the last failed job on every execution.
Reach out to the R&D team for further enquiries.
Last updated
Was this helpful?