Who is this article for?

CrashPlan for Small Business, no.

Code42 for Enterprise, yes.

Link: Product plans and features.

This article applies to Code42 cloud environments.

Code42 Support

Initial file metadata collection scan FAQs

When you enable file metadata collection, Code42 scans and indexes all files on endpoints and in any monitored cloud data sources. This article answers frequently asked questions about this process, including:

  • How long does scanning take?
  • When are file events visible in the Code42 console?
  • What is the CPU impact to user devices?


  • Endpoint data source: A single user device running the Code42 app.
  • Cloud data source: A corporate cloud service you authorize Code42 to monitor. Examples include Box, Google Drive, and Microsoft OneDrive for Business. This does not include email data sources (such as Gmail and Office 365 email) because there is no initial file scan for email services.
  • Initial ingest: The file scan process that indexes all files on an endpoint data source. This scan is performed by the Code42 app installed on each device.
  • Initial extraction: The file scan process that indexes all files in a cloud data source. This scan is performed via a direct connection between Code42 and the cloud service and does not involve the Code42 app on user devices.


What is the initial scan?

Enabling file metadata collection starts a different scan for each type of data source:

  • Endpoint data source: The initial ingest scans and indexes all files on the device. This ingest creates a record of all files on the device at the time file metadata collection was enabled.
    • The initial ingest creates New file events for all existing files. As a result, user devices might temporarily use a high percentage of CPU resources. Once the initial scan is complete, Code42 only monitors new and incremental file changes, using significantly fewer CPU resources. 
    • If you disable and then re-enable file metadata collection, the initial ingest on the device starts over.
  • Cloud data sources: The initial extraction scans and indexes all files in your organization's cloud drives. This scan does not affect user devices at all; Code42 connects directly to the cloud service to capture this data. After initial extraction, Code42 processes new files in existing drives immediately, and looks for new drives every 24 hours.

How long does the scan take?

  • Endpoint data source: Ingest times vary based on the number and size of files on each device. The processing power of the device and your Code42 CPU usage settings also affect how long the ingest takes. With the recommended CPU usage settings ("When user is away": 50%, "When user is present": 20%), a device can take anywhere from several hours to several days to scan every file.
  • Cloud data sources: For most environments, initial extraction takes between 24 and 48 hours. Timing depends largely on the number of files within the drive, not the size of the files.

Can I start viewing file activity before the scan is fully complete?

  • Endpoint data sources: Yes. As soon as a file is scanned and indexed, file events for that file are visible in Code42. In addition, file activity that may indicate an exposure risk (such as moving files to removable media, or uploading files to personal cloud services or email) is given priority over indexing all files on the device and is reported in near real time.
  • Cloud data sources: Yes. File events become visible on a drive-by-drive basis. As soon as scanning completes for a drive, file events for files in that drive are reported.

How do I know when the scan is complete?

Endpoint data source

Scan status is visible in the Code42 app logs on each device:

  1. On the device, open the Code42 app.
  2. Enter the keyboard shortcut Ctrl+Shift+C (Windows) or Option+Command+C (Mac) to open the Code42 Commands interface.
  3. Enter the command getlogs and press Enter.
    The Code42 app compiles the logs and displays the location of the compressed archive.
  4. Navigate to the location of the exported log archive and open the archive.
  5. Locate and open the service.log.0 file.
  6. Search the service.log.0 file for these strings, which indicate the initial scan is complete:
    • Transitioned FFS ingest state from INITIAL_INGEST to SCAN_SUCCESS
    • Transitioned FFS ingest state from SCAN_SUCCESS to STEADY_STATE
      If these strings do not appear in the log file, either the scan is still in progress, or it completed long enough ago that the messages are now in an older log file (for example, service.log.1 or service.log.2).
      Contact our Customer Champions for support if you need help determining the scan status for a device.
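
Once the log archive is extracted, the search in step 6 can be scripted. The following Python sketch is illustrative only and is not a Code42 tool: the function name find_completion_lines and the log_dir argument are hypothetical, and it assumes the archive has already been unpacked into a folder that contains service.log.0 and any rotated files. The two marker strings are the ones quoted in the steps above.

```python
from pathlib import Path

# Marker strings quoted in the steps above.
COMPLETION_MARKERS = (
    "Transitioned FFS ingest state from INITIAL_INGEST to SCAN_SUCCESS",
    "Transitioned FFS ingest state from SCAN_SUCCESS to STEADY_STATE",
)


def find_completion_lines(log_dir):
    """Return (file name, line) pairs for every marker found.

    Checks service.log.0 and any rotated files (service.log.1, ...)
    in log_dir, which is assumed to hold the extracted log archive.
    """
    hits = []
    for log_file in sorted(Path(log_dir).glob("service.log.*")):
        # errors="replace" keeps the search going past undecodable bytes.
        with open(log_file, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if any(marker in line for marker in COMPLETION_MARKERS):
                    hits.append((log_file.name, line.rstrip("\n")))
    return hits
```

An empty result means neither string was found in any of the files searched, which matches the case described above: the scan is still running, or the messages have rotated out of the available logs.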

Cloud data sources 

In the Code42 console, go to Investigation > Data Sources and review the Status column.

  • A status of Initializing indicates extraction is still in progress. As extraction progresses, the Status column shows the number of drives completed out of the total number of drives found by Code42. For example, a status of Initializing (42/50) indicates that 42 of 50 drives are complete and 8 remain.
    • Once a drive is complete, file events for files in that drive are visible in Code42.
    • Only personal drives are included in this count. Google shared drives are not included.
    • If the status indicates all drives are complete (for example, Initializing (50/50)), Code42 may still be scanning Google shared drives.
  • A status of Monitoring indicates extraction is complete.
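
If you copy status values out of the console for tracking, the count can be worked out with a few lines of Python. This is a minimal sketch that assumes the status text looks like the examples above ("Initializing (42/50)" or "Monitoring"); the function name drives_remaining is hypothetical, and the exact text shown in the Status column may differ.

```python
import re

# Assumed format, based on the example "Initializing (42/50)".
# The parentheses are treated as optional.
_STATUS_PATTERN = re.compile(r"^Initializing\s*\(?(\d+)\s*/\s*(\d+)\)?$")


def drives_remaining(status):
    """Return the number of personal drives still to be extracted.

    Returns 0 for "Monitoring" and None when the status has no count.
    """
    text = status.strip()
    if text == "Monitoring":
        return 0
    match = _STATUS_PATTERN.match(text)
    if match is None:
        return None
    completed, total = int(match.group(1)), int(match.group(2))
    return total - completed


print(drives_remaining("Initializing (42/50)"))  # prints 8
```

A result of 0 for an Initializing status only covers personal drives; as noted above, Google shared drives are not counted and may still be scanning.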

Why is a device using more CPU than the max allowed?

Because the initial scan reviews and indexes all files on a device, user devices might temporarily use a high percentage of CPU resources.

Code42 app CPU settings apply to the amount of CPU processing time dedicated to Code42, not to total CPU processing capacity. Therefore, if the CPU limit is set to 20% (for example), the device's Task Manager or Activity Monitor may report that the Code42 app is using more than 20% of the CPU at a particular point in time.

The processing time of the CPU is measured in instruction cycles. When you limit CPU use for the Code42 app to X%, you are specifying that the Code42 app is allowed to use as much of the CPU capacity as it needs for up to X% of the available cycles. For example, if the CPU limit is set to 20%, the Code42 app can use up to 100% of the CPU 20% of the time. The remaining 80% of the time, the CPU prioritizes other process requests. This allows the Code42 app to work as efficiently as possible when it requests CPU resources, but limits the overall impact to the device.
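
The arithmetic behind this can be sketched in Python. This is a simplified illustration, not how the Code42 app schedules work: it treats the limit as a share of a wall-clock window rather than of instruction cycles, and the function name cpu_time_budget is hypothetical.

```python
def cpu_time_budget(limit_percent, window_seconds):
    """Split a time window into the part the app may use at full CPU
    and the part left for other processes."""
    if not 0 <= limit_percent <= 100:
        raise ValueError("limit_percent must be between 0 and 100")
    app_seconds = window_seconds * limit_percent / 100
    return app_seconds, window_seconds - app_seconds


# A 20% limit over one minute: up to 12 seconds at full CPU for the app,
# with the other 48 seconds prioritized for other processes.
app, others = cpu_time_budget(20, 60)
print(app, others)  # prints 12.0 48.0
```

During those 12 seconds a monitoring tool can show the app well above 20%, which is why a momentary reading can exceed the configured limit even though the limit is being respected over time.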

For the best mix of performance and speed, we recommend setting "When user is away, use up to" to 50% and "When user is present, use up to" to 20%.

How much memory does Code42 use?

The Code42 app dynamically sets its memory allocation to 25% of the physical memory on the device. For example, if the device has 8 GB of RAM, the Code42 app can use up to 2 GB.
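
For capacity planning across devices with different amounts of RAM, the same 25% rule can be applied in a short Python sketch. The function name memory_ceiling_gb is hypothetical, and the only input taken from this article is the 25% figure.

```python
def memory_ceiling_gb(physical_memory_gb, fraction=0.25):
    """Return the most memory (in GB) the app can use, assuming the
    25% allocation described above."""
    return physical_memory_gb * fraction


for ram_gb in (8, 16, 32):
    print(ram_gb, "GB RAM ->", memory_ceiling_gb(ram_gb), "GB")
```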