
Applies to: Incydr Professional, Enterprise, Horizon, and Gov F2; Incydr Basic, Advanced, and Gov F1.

Does not apply to: Instructor.


Initial file metadata collection scan FAQs


When you enable file metadata collection, Code42 scans and indexes all files on user devices. (For Incydr Professional, Enterprise, Horizon, and Gov F2, file metadata collection is enabled automatically, so scanning begins immediately upon deploying the insider risk agent to user devices.)

In addition, when you add a cloud data connection, Code42 scans and inventories all files in your organization's cloud drives that belong to monitored users.

This article answers frequently asked questions about this process. It uses the following terms:


  • Endpoint data source: A single user device running a Code42 agent.
  • Cloud service data source: A corporate cloud service you authorize Code42 to monitor. Examples include Box, Google Drive, and Microsoft OneDrive for Business. This does not include email data sources (such as Gmail and Office 365 email) because there is no initial file scan for email services.
  • Initial ingest: The file scan process that indexes all files on an endpoint data source. This scan is performed by the Code42 agent installed on each device.
  • Initial inventory: The file scan process that inventories all files in a cloud data source. This scan is performed via a direct connection between Code42 and the cloud service and does not involve Code42 agents on user devices. This scan typically begins after a cloud service connection is first authorized and does not need to be complete for Code42 to begin monitoring for user activity in those cloud drives.


What is the initial scan?

There are different scans for endpoints and for cloud data connections.

  • Endpoint data source: The initial ingest scan indexes all files on the device.
    • For Incydr Basic and Advanced and other plans, enabling File Metadata Collection initiates the scan. For Incydr Professional, Enterprise, Horizon, and Gov F2, the scan starts automatically.
    • This ingest creates a record of all files on the device at the time file metadata collection was enabled. As a result, user devices might temporarily use a high percentage of CPU resources. Once the initial scan is complete, Code42 only monitors new and incremental file changes, using significantly fewer CPU resources.
    • If you disable and then re-enable File Metadata Collection, the initial ingest file scan on the device starts over. (In Incydr Professional, Enterprise, Horizon, and Gov F2, there is no option to disable File Metadata Collection.)
  • Cloud service data sources: The initial inventory process scans all files in your organization's cloud drives that belong to monitored users. This scan does not affect user devices at all. Code42 connects directly to the cloud service to capture this data and starts monitoring file activity right away, while the inventory is still in progress.
    Shared libraries in Microsoft are not inventoried, discovered, or monitored
    Code42 can only monitor drives in Microsoft OneDrive. While you can create a shared library within OneDrive, such libraries are actually created as Team Sites in SharePoint. Because Code42 cannot monitor sites in SharePoint, any shared libraries listed in your OneDrive environment are excluded.

How long does the scan take?

  • Endpoint data source: Ingest times vary based on the number and size of files on each device. The processing power of the device and your Code42 CPU usage settings also affect how long the scan takes to complete. With CPU usage settings of 50% for "When user is away" and 20% for "When user is present", a device can take from several hours to several days to scan every file. (CPU settings are not configurable for insider risk agents.)
  • Cloud service data sources: The time the initial inventory takes depends on the number of files in the drives that belong to the monitored, in-scope users in your environment. The inventory process does not affect monitoring of user activity in your cloud storage, and it does not need to finish before monitoring begins.
    • For environments that contain hundreds of drives, the initial inventory may take between 24 and 72 hours depending on the number of files in each user's drives. The inventory process can take longer if Code42's connection to the cloud environment is throttled. Throttling may occur for these reasons:
      • Google Drive connections can be throttled based on the number of requests made by both the Code42 service per user drive and by all services in the account as a whole
      • OneDrive connections can be throttled based on all requests (including those from Code42) for the account as a whole
      • Box connections can be throttled based on requests made by Code42 per user drive
    • For environments that contain thousands of drives, the initial inventory completes over a longer period. Typically, drives in larger environments complete the inventory process over these time frames:
      • 60% of total drives complete between 24 and 72 hours
      • 25% of total drives complete between 3 and 5 days
      • 15% of total drives complete between 6 and 10 days
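
To make the percentages above concrete, the following sketch converts them into approximate drive counts for an environment of a given size. It is only an illustration of the arithmetic: the function name and bucket labels are hypothetical, and the shares are the typical figures quoted above, not guarantees.

```python
# Typical completion shares quoted in this article (percent of total drives).
INVENTORY_BUCKETS = (
    ("24 to 72 hours", 60),
    ("3 to 5 days", 25),
    ("6 to 10 days", 15),
)


def estimate_drive_completion(total_drives):
    """Return the approximate number of drives finishing in each time frame.

    Counts are rounded, so they may not add up exactly to total_drives.
    """
    return {
        label: round(total_drives * percent / 100)
        for label, percent in INVENTORY_BUCKETS
    }


print(estimate_drive_completion(5000))
# {'24 to 72 hours': 3000, '3 to 5 days': 1250, '6 to 10 days': 750}
```

Throttling by the cloud service can lengthen any of these time frames, so treat the result as a rough planning figure.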

Can I start viewing file activity before the scan is fully complete?

  • Endpoint data sources: Yes. As soon as a file is scanned and indexed, file events for that file are visible in Code42. In addition, file activity that may indicate an exposure risk (such as moving files to removable media or uploading them to personal cloud services or email) is given priority over indexing all files on the device and is reported in near real-time.
  • Cloud data sources: Yes. Code42 starts monitoring file activity in your organization's environment right away while scanning and completing an inventory of the files on drives owned by in-scope users. File events for these drives become available in Code42 soon after they occur, even if the inventory of those drives has not completed.

How do I know when the scan is complete?

Endpoint data source

Scan status is visible in Code42 agent logs for each device:

  1. Sign in to the Code42 console.
  2. Select Administration > Devices.
  3. Select a device.
    The device details appear.
  4. Retrieve the logs:
    • Incydr Basic and Advanced and other plans: From the action menu, select Retrieve Logs.
    • Incydr Professional, Enterprise, Horizon, and Gov F2: Select Actions > Retrieve Logs, then click Retrieve Logs.
  5. In the Retrieve log: Email notification dialog, select Yes and enter an email address to receive an email notification, or select No.
  6. Click Apply.
    Retrieving... displays under the Logs URL column of the table. (If the device is offline, Retrieving... displays until the device is online.) When the logs are available, a Download logs link displays.
  7. Click the Download logs link.
    The logs are downloaded in a ZIP file. 
  8. Navigate to the location of the downloaded log archive and open the archive.
  9. Locate and open the service.log.0 file.
  10. Search the service.log.0 file for these strings, which indicate the initial scan is complete:
    • Transitioned FFS ingest state from INITIAL_INGEST to SCAN_SUCCESS
    • Transitioned FFS ingest state from SCAN_SUCCESS to STEADY_STATE
      If the above strings do not appear in the log file, either the scan is still in progress or it completed long enough ago that the messages are in an older log file (for example, service.log.1 or service.log.2).
      Contact our Technical Support Engineers if you need help determining the scan status for a device.
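
If you check many devices, the search in the final step can be scripted. The following is a minimal sketch, assuming the downloaded ZIP contains plain-text files whose names start with service.log (possibly inside a subfolder). The function names are hypothetical; the two search strings are the ones listed above.

```python
import zipfile
from pathlib import Path

# Completion messages quoted from this article.
COMPLETION_MARKERS = (
    "Transitioned FFS ingest state from INITIAL_INGEST to SCAN_SUCCESS",
    "Transitioned FFS ingest state from SCAN_SUCCESS to STEADY_STATE",
)


def find_completion_lines(lines):
    """Return the lines that contain either completion message."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in COMPLETION_MARKERS)
    ]


def scan_log_archive(zip_path):
    """Search every service.log* file in a downloaded log archive.

    Returns a dict that maps each archive member name to its matching lines.
    Files without a match are omitted.
    """
    matches = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Assumption: the log may sit in a subfolder of the archive,
            # so only the final path component is compared.
            if not Path(name).name.startswith("service.log"):
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            found = find_completion_lines(text.splitlines())
            if found:
                matches[name] = found
    return matches
```

For example, `scan_log_archive("device-logs.zip")` (a placeholder file name) returns an empty dict when neither message is present in any service.log file in the archive. As noted above, that means the scan is still running or the messages have rotated out of the retained logs.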

Cloud service data connections 

In the Code42 console, go to Administration > Integrations > Data Connections and review the Status column.

  • A status of Monitoring indicates that Code42 is currently monitoring the cloud environment for file activity. If you have just authorized the Code42 connection to the cloud storage environment, Code42 simultaneously completes the inventory process.
  • To view more detailed status information, click a row in the Data Sources table to open the details panel for that cloud service. This panel lists the total number of monitored users for which Code42 has discovered drives and is currently monitoring for file activity. For Google Drive, a second section repeats these details for shared or team drives.

Why is a device using more CPU than the max allowed?

Does not apply to insider risk agents

Because the initial scan reviews and indexes all files on a device, user devices might temporarily use a high percentage of CPU resources.

Code42 agent CPU settings apply to the amount of CPU processing time dedicated to Code42, not to total CPU processing capacity. Therefore, if the CPU limit is set to 20% (for example), the device's Task Manager or Activity Monitor may report that the Code42 agent is using more than 20% of the CPU at a particular point in time.

The processing time of the CPU is measured in instruction cycles. When you limit CPU use for the Code42 agent to X%, you are specifying that the Code42 agent is allowed to use as much of the CPU capacity as it needs for up to X% of the available cycles. For example, if the CPU limit is set to 20%, the Code42 agent can use up to 100% of the CPU 20% of the time. The remaining 80% of the time, the CPU prioritizes other process requests. This allows the Code42 agent to work as efficiently as possible when it requests CPU resources, but limits the overall impact on the device.
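
The arithmetic behind this can be sketched as follows. This is an illustration of the time-share idea only, assuming a fixed observation window. It does not describe how the agent actually schedules its work, and the function name is hypothetical.

```python
def cpu_time_budget(limit_percent, window_seconds):
    """Split an observation window into agent time and time left for other work.

    limit_percent is the configured CPU limit (0 to 100). During its share of
    the window the agent may use up to 100% of the CPU, which is why a monitor
    can briefly show a reading above the configured limit.
    """
    if not 0 <= limit_percent <= 100:
        raise ValueError("limit_percent must be between 0 and 100")
    agent_seconds = window_seconds * limit_percent / 100
    return agent_seconds, window_seconds - agent_seconds


print(cpu_time_budget(20, 60))  # (12.0, 48.0)
```

With a 20% limit, the agent may run at full speed for up to 12 seconds of every 60, and other processes take priority for the remaining 48 seconds.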

For the best mix of performance and speed, set "When user is away, use up to" to 50% and "When user is present, use up to" to 20%.

How much memory does Code42 use?

Code42 agents dynamically set memory allocation to use 25% of the physical memory on the device. For example, if the device has 8GB of RAM, the Code42 agent can use up to 2GB.
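
As a quick way to apply this rule to other device sizes, the following sketch computes the ceiling from installed RAM. The function name is hypothetical, and the 25% figure is the one stated above.

```python
def agent_memory_ceiling_gb(physical_ram_gb, fraction=0.25):
    """Return the most memory (in GB) the agent can use on a device."""
    return physical_ram_gb * fraction


for ram_gb in (8, 16, 32):
    print(f"{ram_gb} GB RAM: up to {agent_memory_ceiling_gb(ram_gb)} GB")
# 8 GB RAM: up to 2.0 GB
# 16 GB RAM: up to 4.0 GB
# 32 GB RAM: up to 8.0 GB
```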
