Code42 Support

Data leak prevention and detection with the Code42 API

Available in:

Standard, Premium, Enterprise
Small Business
Overview

This tutorial describes a prototype data leak protection (DLP) security solution in the form of a Python script that uses the Code42 API. The script monitors and protects the archives of selected users in your Code42 environment against unauthorized or suspicious restore activity.

This article explains:

  • The purposes, limitations, and principles of the script.
  • How to configure, install, and implement the script on the Linux platform.
  • How to interpret the output of the script.

Capabilities of the DLP script

The restoreWatch.py script described in this article can monitor one or more users for:

  • Number of restores since the last script run
  • Restores to devices that are not the original source of the data
  • Restores performed by a user who does not own the data
  • Web restores

When the script detects an activity that may lead to a data leak, it can take one of the following actions:

  • None
  • Warn
  • Block

Considerations

  • Basic knowledge of the Python programming language is highly recommended.
  • You must be familiar with basic scripting and the configuration of a job scheduler on your operating system.
  • This article uses the Linux operating system for the example implementation. You are responsible for adapting this solution to your particular environment.
Using the example script in your Code42 environment
The example script in this article is intended for demonstration purposes only. You can use the script as the basis for a DLP solution that is customized for your environment. Our Customer Champions cannot help you customize or troubleshoot the script. For assistance with creating custom solutions, contact your PRO Services representative.

Before you begin

  • This solution requires root or administrator privileges on a server or workstation with the following features:
    • Network access to the authority server in your Code42 environment on port 4280 or port 4285
    • A job scheduler such as cron or Windows Task Scheduler
    • Python (version 2.7 or higher)
  • You must have an on-premises authority server.
  • Confirm that you can access the Code42 API Documentation Viewer in order to reference the built-in API documentation.
  • Familiarize yourself with Code42's API examples on GitHub.

API methods used

The example script in this article uses the following resources of the Code42 API:

  • User
  • UserBlock
  • RestoreRecord

The available methods, as well as information on usage, syntax, and examples, are available through the API Documentation Viewer.

Set up the script for your Code42 environment

Step 1: Download the script from GitHub

The example script, restoreWatch.py, is located on GitHub. You may download the script in one of two ways:

Step 2: Place the script on a test device

The restoreWatch.py script can be run from:

  • A Code42 server (either an authority server or storage server).
  • Any computer, server, or workstation able to communicate securely with the authority server on port 4285 or 4280.
    • Using port 4285 and SSL is recommended for security reasons.
    • Using port 4280 may expose sensitive data.

The restoreWatch.py script can run on any device that has version 2.7 or higher of the Python language.

To install the restoreWatch.py script:

  1. Copy the script to the device and directory where it will run.
  2. Change the permissions of the script to allow it to run (for example, using Unix/Linux chmod).
    • The restoreWatch.py script may contain sensitive information such as the username and password of a user with SYSADMIN privileges on your authority server.
    • Set permissions as restrictively as possible to prevent unauthorized access to the file contents.

Step 3: Modify the script by adding parameters for your test environment

Change the values of the parameters below to match your test Code42 environment. Before making changes, make a backup copy of the original script.

Test on sample data
Test this script in a separate test environment or test organization before deploying this script to monitor real users. For example, you might create a test organization named "Restore Watch Test Org," create some test users, and configure the script to monitor these test users. For added safety, set up your test on a virtual machine running a non-production version of the Code42 environment.

Set the parameters described below to values that test the script accurately for your Code42 environment. Each parameter is in the section of the script defined by the headings below:

Admin parameters

Configure the parameters in the ADMIN PARAMETERS section of the script:

Parameter Description
c42_master

Sets the URL for the administration console of your authority server:

  • Use https://master-server.example.com, where <master-server.example.com> is the IP address or fully qualified domain name (FQDN) of the authority server in your Code42 environment
  • You may use either the HTTPS or HTTP protocol, although HTTP is less secure and is not recommended.
c42_port

Sets the port for the administration console of your authority server:

  • Use port 4285 for HTTPS.
  • Use port 4280 for HTTP.
c42_admin

Specifies the username of a user that has the SYSADMIN role, or a custom role with the admin permission.

c42_password

Specifies the password of the user.

c42_admin_email

Sets the email address of the administrator or person who should receive the alerts and warnings generated by the script.
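
As a rough illustration of how the admin parameters fit together, the sketch below joins the server URL and port into a base URL and prepares an authorization header from the admin credentials. It is written for Python 3. The helper names build_base_url and basic_auth_header, the placeholder values, and the use of HTTP Basic authentication are assumptions made for this example; they are not taken from restoreWatch.py.

```python
import base64

# Placeholder values (assumptions); replace with your test environment settings.
c42_master = "https://master-server.example.com"
c42_port = "4285"
c42_admin = "admin"
c42_password = "example-password"


def build_base_url(master, port):
    """Join the server URL and port, for example https://host:4285."""
    return "{0}:{1}".format(master.rstrip("/"), port)


def basic_auth_header(username, password):
    """Return an HTTP Basic Authorization header value (assumed auth scheme)."""
    raw = "{0}:{1}".format(username, password).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    print(build_base_url(c42_master, c42_port))
```

Keeping the URL and port as separate values, as the script's parameters do, makes it easy to switch between port 4285 (HTTPS) and port 4280 (HTTP) during testing.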

Monitored user parameters

Configure the parameters in the MONITORED USER PARAMETERS section of the script:

Parameter Description
USERID

Sets the userId of the user to monitor.

  • To find a user's userId:
    1. From the administration console, go to Users.
    2. Select a user.
    3. Look at the URL in your browser for the userId parameter. For example, [userId=123].
  • If you wish to monitor multiple users, see the section below on monitoring multiple users.
TOO_MANY_RESTORES_ACTION

Sets the action to take if the script detects more than a defined number of restores between runs:

  • NONE
  • WARN
  • BLOCK
TOO_MANY_RESTORES_THRESHOLD

Sets the limit for the number of restores for the user. If the number of detected restores since the last run of the script is higher than this threshold, the action defined by TOO_MANY_RESTORES_ACTION is taken.

NON_ORIGIN_DEVICE_RESTORE_ACTION

Sets the action to take if files or folders are restored to a device that is not the original source of the data:

  • NONE
  • WARN
  • BLOCK
NON_OWNER_RESTORE_ACTION

Sets the action to take if files or folders are restored by a user who is not the owner of the data:

  • NONE
  • WARN
  • BLOCK
WEB_RESTORE_ACTION

Sets the action to take if the files or folders are restored using the web restore feature of the Code42 app:

  • NONE
  • WARN
  • BLOCK
Definition of actions

Note: all action options are case-sensitive. Use all caps when entering these values.

  • NONE: No action is taken, and the event is not stored in the CSV file. No email is sent.
  • WARN: A warning is sent to the user defined in the c42_admin_email parameter, and optionally to the owner of the archive.
  • BLOCK: The user who performed the restore is blocked. A warning is sent to the user defined in the c42_admin_email parameter, and optionally to the owner of the archive.
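
The action definitions and the restore threshold can be summarized in a short sketch. The function names too_many_restores and plan_response are hypothetical and do not come from restoreWatch.py. The sketch also assumes that WARN and BLOCK events are recorded in the CSV file, which the definitions above imply but do not state outright.

```python
VALID_ACTIONS = ("NONE", "WARN", "BLOCK")


def too_many_restores(restore_count, threshold):
    """True when the count since the last run is higher than the threshold."""
    return restore_count > threshold


def plan_response(action, email_archive_owner=False):
    """Describe what an action implies, following the definitions above."""
    if action not in VALID_ACTIONS:
        # Action values are case-sensitive, so "warn" is rejected.
        raise ValueError("Unknown action: {0!r}".format(action))
    triggered = action != "NONE"
    return {
        "record_in_csv": triggered,  # assumption: WARN and BLOCK are logged
        "email_admin": triggered,
        "email_owner": triggered and email_archive_owner,
        "block_user": action == "BLOCK",
    }
```

For example, plan_response("BLOCK", email_archive_owner=True) describes an event that is logged, emailed to both the administrator and the archive owner, and followed by a block on the user who performed the restore.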

Reporting settings

Configure the parameter in the REPORTING SETTINGS section of the script:

Parameter Description

EMAIL_ARCHIVE_OWNER

Determines whether the owner of the files or folders that were restored is sent an email warning (case-sensitive):

  • True
  • False

Miscellaneous parameters

Configure the parameters in the MISC PARAMETERS section of the script:

Parameter Description
DATA_FILE

Sets the file name of the binary file that stores important data used by the restoreWatch.py script between runs.

  • This file is not human readable.
  • If you want to run multiple instances of the script from the same directory, change this parameter so that it is different in each instance of the script. For example: restoreWatchData1, restoreWatchData2, and so on.
CSV_FILE

Sets the file name of the CSV file that records information, actions and warnings about restore events.

  • View this file using any spreadsheet program, such as Excel.
  • If you want to run multiple instances of the script from the same directory, change this parameter so that it is different in each instance of the script. For example: restoreWatch1.csv, restoreWatch2.csv, and so on.

SSL security settings

Configure the parameter in the SSL SECURITY SETTINGS section of the script:

Parameter Description

VERIFY_CERT

Determines whether the script validates your authority server's SSL certificate (case-sensitive):

  • True
  • False (use this option if your authority server uses a self-signed certificate)
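
To show what a setting like VERIFY_CERT typically controls, here is a minimal sketch that uses only the Python 3 standard library. The function name make_ssl_context is hypothetical, and restoreWatch.py may apply the setting differently (for example, through an HTTP library).

```python
import ssl


def make_ssl_context(verify_cert):
    """Build an SSL context that honors a VERIFY_CERT-style flag."""
    context = ssl.create_default_context()
    if not verify_cert:
        # Accepts self-signed certificates, but also removes protection
        # against man-in-the-middle attacks.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Hostname checking must be turned off before verification is disabled, which is why the two assignments appear in that order.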

Email settings

Configure the parameters in the EMAIL SETTINGS section of the script:

Parameter Description
USE_MAILX

Determines whether the Code42 server sends email warnings using the local mailx mail user agent, if it is installed on the server (case-sensitive):

  • True (use mailx on the Code42 server to send emails)
  • False

If USE_MAILX is set to False, then you must configure the parameters below.

MAIL_HOST

Sets the fully qualified domain name (FQDN) of your mail server.

SMTP_USE_SSL

Specifies whether your mail server requires SSL encryption (case-sensitive):

  • True
  • False

SMTP_PORT

Sets the port that your mail server uses, if it is non-standard. Leaving this value set to Default causes the script to use the default SMTP port.

SMTP_REQUIRES_AUTH

Specifies whether your mail server requires authentication to send mail (case-sensitive):

  • True
  • False

SMTP_USER

Sets the username used to authenticate with your mail server.

SMTP_PASS

Sets the password used to authenticate with your mail server.

SMTP_SENDING_USER

  • Sets the email address that you want to appear in the 'from' field of the warning emails.
  • Use SMTP_USER if you want the email to appear to come from the authenticated SMTP user defined in SMTP_USER.
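
The following sketch suggests how the SMTP parameters might be used together when USE_MAILX is False. It is written for Python 3 with the standard smtplib module. The function names and argument names are hypothetical and are not the script's own code.

```python
import smtplib
from email.message import EmailMessage


def build_warning_email(sender, recipients, subject, body):
    """Assemble a plain-text warning message."""
    msg = EmailMessage()
    msg["From"] = sender                  # corresponds to SMTP_SENDING_USER
    msg["To"] = ", ".join(recipients)     # for example, c42_admin_email
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_warning_email(msg, mail_host, smtp_port=None, use_ssl=False,
                       requires_auth=False, smtp_user=None, smtp_pass=None):
    """Send the message using MAIL_HOST-style settings."""
    smtp_class = smtplib.SMTP_SSL if use_ssl else smtplib.SMTP
    # Port 0 tells smtplib to use the default port for the chosen class.
    with smtp_class(mail_host, smtp_port or 0) as server:
        if requires_auth:
            server.login(smtp_user, smtp_pass)
        server.send_message(msg)
```

Separating message construction from delivery lets you inspect the warning text without contacting a mail server.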
Stop editing
Do not modify the script below this point.

Step 4: Test the script

You are now ready to test the restoreWatch.py script from the command line. Running the script from the command line before adding it to your system's job scheduler has the following advantages:

  • You can immediately see any errors generated by the script.
  • Files generated by the script are immediately available for inspection.

Run the script

To test the script from the command line, run the following command in a terminal window while in the same directory as the script:

python restoreWatch.py

Confirm the restoreWatchData file exists

After you run the script for the first time, check to confirm that the restoreWatchData binary file exists in the directory where the restoreWatch.py script is located. The following example using the Linux ls command was run from a terminal window, showing that the restoreWatchData file exists with a recent timestamp:

linuxServer:dataLeakPrevention admin$ ls -al
total 208
drwxr-xr-x   8 admin  root root    272 May 29 14:20 .
drwxr-xr-x  19 admin  root root    646 May 28 20:52 ..
-rwxr-xr-x   1 admin  root root  24081 May 28 20:57 restoreWatch.py
-rw-r--r--   1 admin  root root    891 May 29 14:25 restoreWatchData
linuxServer:dataLeakPrevention admin$ date
Fri May 29 14:26:39 CDT 2015

This confirms that the script ran successfully and created the binary file it uses to save its state between runs.
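
If you prefer to automate this check, a small Python 3 sketch such as the following could stand in for the manual ls and date comparison. The function name state_file_is_fresh and the five-minute default are assumptions for illustration.

```python
import os
import time


def state_file_is_fresh(path, max_age_seconds=300):
    """True if the file exists and was modified within max_age_seconds."""
    if not os.path.isfile(path):
        return False
    return (time.time() - os.path.getmtime(path)) <= max_age_seconds


if __name__ == "__main__":
    print(state_file_is_fresh("restoreWatchData"))
```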

Trigger an event to receive an email warning and create a CSV file

You can now test the script further by performing a restore that will trigger the script to send alert emails and create the CSV file:

  1. Based on the settings that you configured in Monitored User Parameters, perform one of the following types of restores:

    • A web restore
    • A number of client restores that exceeds the threshold set by the TOO_MANY_RESTORES_THRESHOLD parameter
    • A web restore that is performed by a user who is not the owner of the files
    • A web restore to a device that is not the same device as the source of the files
  2. Run the restoreWatch.py script again from the command line with the command python restoreWatch.py.
  3. Confirm that the restoreWatch.csv file exists in the directory where the restoreWatch.py script is located.
  4. Open the CSV file with a spreadsheet program such as Excel to verify that it contains the restore events that you used to trigger the script.
  5. Confirm that the alert email was sent to the configured user, and optionally to the monitored user's email address.
    If the warning emails were sent and the CSV file was created, this confirms that the script is configured correctly.
  6. (Optional) Delete the restoreWatchData and restoreWatch.csv files to discard the initial internal configuration and test results.
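
As an alternative to a spreadsheet program, the CSV file can be inspected from the command line with a few lines of Python 3. This sketch makes no assumptions about the column layout; it simply prints every row. The function names are hypothetical.

```python
import csv


def read_restore_events(csv_path):
    """Return every row of the CSV file as a list of strings."""
    with open(csv_path, newline="") as handle:
        return list(csv.reader(handle))


def print_restore_events(csv_path):
    """Print each row with its fields separated by ' | '."""
    for row in read_restore_events(csv_path):
        print(" | ".join(row))
```

For example, calling print_restore_events("restoreWatch.csv") from the script's directory lists the recorded events, assuming the default CSV_FILE name.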

Step 5: Add the script to a job scheduler

Add the restoreWatch.py script to your system's job scheduler.

On a Linux system, use the command crontab -e to add the script to the system or user's crontab file. See the Linux man page for crontab for more information.

The following crontab entry runs the restoreWatch.py script every minute:

* * * * * python /home/ubuntu/restoreWatch.py

Considerations

The script can be run as often as once per minute by the cron daemon. However, the optimal schedule depends on your Code42 environment:

  • In a Code42 environment with few users, a single authority server, and no storage servers, you may be able to run the script at a rate of once per minute.
  • In a Code42 environment with a large number of monitored users and multiple storage servers, it may be preferable to run the script once or twice per day.
  • If your Code42 environment uses one or more storage servers, then the data on the most recent restore jobs may not be available to the RestoreRecord resource immediately. Data about recent restore events from storage servers is not available to the Code42 API until the next daily maintenance job runs, which may take up to 24 hours. If your Code42 environment uses a provider storage destination, the restore event data may take even longer to be received by the authority server, because a provider sync job must run, followed by a daily maintenance job.

Step 6: Test the script after the job scheduler runs

  1. After the script is run by your system's job scheduler, perform one of the following types of restores based on the settings that you configured in Monitored User Parameters:
    • A web restore
    • A number of client restores that exceeds the threshold set by the TOO_MANY_RESTORES_THRESHOLD parameter
    • A web restore that is performed by a user who is not the owner of the files
    • A web restore to a device that is not the same device as the source of the files
  2. After the next scheduled run of the script, confirm that the file restoreWatch.csv exists in the directory where the restoreWatch.py script is located, or that it was appended with the new data if you did not delete it in Step 4.
    Each run of the script appends to the existing CSV file. The script does not create a new CSV file during each run.
  3. Open the CSV file with a spreadsheet program such as Excel to verify that it contains the restore events that you used to trigger the script.
  4. Confirm that the alert email was sent to the configured admin, and optionally to the monitored user's email address.

Modify the script to monitor multiple users

You can configure the restoreWatch.py script to monitor multiple users by editing the declaration of the Python dictionary that holds the monitored users and their settings.

Back up your script
Make a backup copy of your working restoreWatch.py script before modifying it to monitor additional users.

Step 1: Find the data dictionary object "initial_data"

  1. Using your favorite text editor, such as vim, Notepad, or Emacs, open the restoreWatch.py script.
  2. Navigate to the location in the file where the data dictionary variable initial_data is declared:
## Modify initial_data dictionary object with care.
initial_data={ USERID:{
                              'too_many_restores_action':TOO_MANY_RESTORES_ACTION,
                              'too_many_restores_threshold':TOO_MANY_RESTORES_THRESHOLD,
                              'non_origin_device_restore_action':NON_ORIGIN_DEVICE_RESTORE_ACTION,
                              'non_owner_restore_action':NON_OWNER_RESTORE_ACTION,
                              'web_restore_action':WEB_RESTORE_ACTION }
                              }

Step 2: Append additional users

Add additional users by copying and pasting the first entry in the data dictionary, separating each new entry with a comma. In this example, two additional users to monitor were added, with userIds 1001 and 1002:

## Modify initial_data dictionary object with care.
initial_data={ USERID:{
                              'too_many_restores_action':TOO_MANY_RESTORES_ACTION,
                              'too_many_restores_threshold':TOO_MANY_RESTORES_THRESHOLD,
                              'non_origin_device_restore_action':NON_ORIGIN_DEVICE_RESTORE_ACTION,
                              'non_owner_restore_action':NON_OWNER_RESTORE_ACTION,
                              'web_restore_action':WEB_RESTORE_ACTION },
                1001:{
                              'too_many_restores_action':TOO_MANY_RESTORES_ACTION,
                              'too_many_restores_threshold':TOO_MANY_RESTORES_THRESHOLD,
                              'non_origin_device_restore_action':NON_ORIGIN_DEVICE_RESTORE_ACTION,
                              'non_owner_restore_action':NON_OWNER_RESTORE_ACTION,
                              'web_restore_action':WEB_RESTORE_ACTION },
                1002:{
                              'too_many_restores_action':TOO_MANY_RESTORES_ACTION,
                              'too_many_restores_threshold':TOO_MANY_RESTORES_THRESHOLD,
                              'non_origin_device_restore_action':NON_ORIGIN_DEVICE_RESTORE_ACTION,
                              'non_owner_restore_action':NON_OWNER_RESTORE_ACTION,
                              'web_restore_action':WEB_RESTORE_ACTION }
                              }

Note that the monitoring settings defined under MONITORED USER PARAMETERS apply to all of the users being monitored. To customize the settings for an individual user, replace the constants in all caps in that user's entry with literal values, such as WARN or BLOCK.
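
Per-user customization could look like the sketch below, which builds the same dictionary shape from a list of userIds and applies overrides for individual users. The helper names default_settings and build_initial_data, the example userIds, and the default values are all assumptions for illustration. The sketch also assumes that action values are stored as strings.

```python
# Stand-ins for the MONITORED USER PARAMETERS section (example values).
TOO_MANY_RESTORES_ACTION = "WARN"
TOO_MANY_RESTORES_THRESHOLD = 5
NON_ORIGIN_DEVICE_RESTORE_ACTION = "WARN"
NON_OWNER_RESTORE_ACTION = "WARN"
WEB_RESTORE_ACTION = "WARN"


def default_settings():
    """Return a fresh copy of the shared settings for one user."""
    return {
        "too_many_restores_action": TOO_MANY_RESTORES_ACTION,
        "too_many_restores_threshold": TOO_MANY_RESTORES_THRESHOLD,
        "non_origin_device_restore_action": NON_ORIGIN_DEVICE_RESTORE_ACTION,
        "non_owner_restore_action": NON_OWNER_RESTORE_ACTION,
        "web_restore_action": WEB_RESTORE_ACTION,
    }


def build_initial_data(user_ids, overrides=None):
    """Map each userId to its settings, applying any per-user overrides."""
    overrides = overrides or {}
    data = {}
    for user_id in user_ids:
        settings = default_settings()
        settings.update(overrides.get(user_id, {}))
        data[user_id] = settings
    return data


# User 1002 gets stricter settings than the other two users.
initial_data = build_initial_data(
    [123, 1001, 1002],
    overrides={1002: {"web_restore_action": "BLOCK",
                      "too_many_restores_threshold": 2}},
)
```

Building the dictionary in a loop avoids copy-and-paste errors such as a missing comma between entries, at the cost of departing from the script's original layout.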

Security notes

  • The binary file restoreWatchData contains serialized data dictionaries to store the program state between runs. Although it is not meant to be human readable, it may contain information that you wish to keep secure. The script does not actively set the permissions of this file to anything more strict than the default permissions for the directory.
  • The restoreWatch.py script needs to contain credentials for a user with the SYSADMIN role (or a custom role with the admin permission) to work with your Code42 environment. The file permissions for the script should be as restrictive as possible, and access to the server or workstation that stores the script should be secured.
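
Because the script does not restrict the permissions of its data file, you may want to tighten them yourself. The sketch below shows one way to write a state file that only its owner can read or write on Linux. It assumes the state is serialized with Python's pickle module, which the article does not confirm, and the function names are hypothetical.

```python
import os
import pickle


def save_state(path, state):
    """Write the state dictionary to a file readable only by its owner."""
    # Create (or truncate) the file with owner-only read/write permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    # os.open applies the mode only to new files, so tighten existing ones too.
    os.chmod(path, 0o600)


def load_state(path):
    """Read the state dictionary back. Only load files that you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

The same os.chmod call can be applied to an existing restoreWatchData file or to the script itself, since the script contains administrator credentials.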

Resource usage

The restoreWatch.py script calls the Code42 API RestoreRecord and User resources on every run for each user being monitored. Using the script to monitor large numbers of users, or running the script at frequent intervals, could cause performance issues for the script and your Code42 environment. You should test any application of the script in a realistic test environment before deployment.
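
A back-of-the-envelope estimate can help when choosing a schedule. The sketch below assumes one RestoreRecord call and one User call per monitored user per run. The actual number of requests the script makes may differ, and the function name is hypothetical.

```python
def api_calls_per_day(monitored_users, runs_per_day, calls_per_user_per_run=2):
    """Estimate daily API calls under the stated assumption."""
    return monitored_users * runs_per_day * calls_per_user_per_run


print(api_calls_per_day(50, 24 * 60))  # 50 users, once per minute: 144000
print(api_calls_per_day(50, 2))        # 50 users, twice per day: 200
```

Under these assumptions, moving from a per-minute schedule to a twice-daily schedule reduces the request volume by a factor of 720.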