LiquidFiles Documentation

Attachment Upload Actionscripts

This article outlines how you can install Attachment Upload Actionscripts. These let you run custom AV solutions in addition to the builtin ClamAV scanner, integrate with DLP solutions, or perform any other check that determines whether a file should be permitted to be uploaded or sent.

Installation of custom AV engines and similar is not covered by this guide. As long as you use standard CentOS yum packages, you should be fine.

The Attachment Upload Actionscript will be executed like this:

env EMAIL=<users_email_address> GROUP=<users_group_name> your_actionscript.ext /path/to/uploaded/file

The script uses exit codes to determine if the file was clean/permitted or not.

  • An exit code of 0 means that the attachment will be permitted.
  • An exit code of 1 or above means that the file will be deleted and marked as virus infected.

Any output from the script will be fed back to the user as the reason why the file was not permitted. The output will be silently ignored if the file was permitted.

By using either the EMAIL or GROUP environment variable, you can create a different policy for different users.

Please note that you can use any programming language that you're comfortable with, that can be executed on the LiquidFiles system. Typically this would mean: perl, ruby, python, bash, sh or c.

A very basic example of a filescan script would look like this:

#!/usr/bin/env ruby

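# assign the user's email and group to variables (not strictly needed for this example)
# LiquidFiles passes these in the EMAIL and GROUP environment variables
email = ENV['EMAIL']
group = ENV['GROUP']

# reject the file if its name ends in .png (case insensitive)
if ARGV[0] =~ /\.png$/i
  puts "PNG files are not allowed"
  exit 1
end

# anything else is permitted
exit 0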

This script simply checks whether the filename ends in .png and rejects the file if it does.
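To illustrate the per-user policy idea mentioned earlier, here is a minimal sketch that blocks different file extensions depending on the GROUP environment variable. The group names, the extension lists and the check_upload helper are placeholders chosen for this example, not LiquidFiles defaults, so adjust them to your own groups and policy.

```ruby
#!/usr/bin/env ruby

# Hypothetical policy table: group name => file extensions blocked for that group.
# The group names and extensions are placeholders for illustration only.
BLOCKED_EXTENSIONS = {
  "External Users" => %w[.exe .js .zip],
  "Local Users"    => %w[.exe]
}.freeze

# Extensions blocked for any group not listed above (also an assumption).
DEFAULT_BLOCKED = %w[.exe].freeze

# Returns [exit_code, message]. Exit code 0 permits the file, 1 rejects it.
def check_upload(group, path)
  blocked = BLOCKED_EXTENSIONS.fetch(group.to_s, DEFAULT_BLOCKED)
  ext = File.extname(path.to_s).downcase
  if blocked.include?(ext)
    [1, "#{ext} files are not allowed for group '#{group}'"]
  else
    [0, ""]
  end
end

# Only act when a file path was passed, which is how the script is invoked on upload.
unless ARGV.empty?
  code, message = check_upload(ENV["GROUP"], ARGV[0])
  puts message unless message.empty?
  exit code
end
```

The same approach works with EMAIL instead of GROUP if you need rules for individual users. Only the message printed for a rejected file is shown to the user.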

A more complex example like the following assumes that you have the Sophos AV scanner installed:

#!/bin/bash

# Sophos - commercial AV scanner for Unix systems
#
# sweep - sophos scanner tool
#
# Parameters:
#   -q         Quick scan
#   -ss        Don't display anything except on error or virus
#   -archive   Scan compressed files (zip, gzip, arj, cmz, tar, rar, cab)
#
# Exit code:
#     0        No virus has been found
#     3        Virus has been found

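file_path="$1"

if [[ -f "$file_path" ]]; then
    # Capture the scanner output, then its exit code
    result=$(/usr/local/bin/sweep -q -ss -archive "$file_path")
    exit_code=$?

    # -gt compares numerically; > inside [[ ]] compares strings
    if [[ $exit_code -gt 0 ]]; then
        echo "Sophos AV result: $result"
    fi

    exit $exit_code
fi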

Performance Considerations

One of the things to keep in mind when creating an Actionscript that scans files on upload is performance. This is especially true for AV scanners because they have a large signature database. In general, you want to keep the scanning as fast as possible. Ideally you want to complete the scanning within a couple of seconds or the system will appear very slow.

How to test

There are a few tests you can do when it comes to performance testing. The first is to run the scanner on its own and time it with the unix time command, like this:

# time /path/to/your/scanner /etc/hosts
/etc/hosts: OK

real	0m0.441s
user	0m0.005s
sys	0m0.008s

This timing is actually from the builtin ClamAV scanner. As you can see, this scan completes in 0.441s on this (not very fast) test system. In this case we're scanning the file /etc/hosts, but you can pick any small file. The point is to measure roughly the fastest possible time a scan can take.
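If you want to repeat the measurement a few times rather than rely on a single run, a small script can do the timing for you. This is only a sketch: the time_command helper and the choice of three runs are assumptions made for this example, and it measures wall-clock time only (the "real" line in the output of time).

```ruby
#!/usr/bin/env ruby

# Runs a command several times and returns the wall-clock seconds of each run.
# command is an array such as ["/path/to/your/scanner", "/etc/hosts"].
def time_command(command, runs: 3)
  Array.new(runs) do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # Discard the scanner's own output; only the elapsed time matters here.
    system(*command, out: File::NULL, err: File::NULL)
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

# Time whatever command line was passed to this script, if any.
unless ARGV.empty?
  timings = time_command(ARGV)
  puts format("fastest %.3fs, slowest %.3fs over %d runs",
              timings.min, timings.max, timings.size)
end
```

Saved under a name of your choice, for instance time_scanner.rb, it would be run as: ruby time_scanner.rb /path/to/your/scanner /etc/hosts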

The next relevant test is not as easy to automate. Select about 20 small files in the File Upload selector, or drop them on the Drop section, and upload them. The reason for selecting 20 or so files is that the background scanner can scan 10 files in parallel, so 20 files fills that capacity twice over. This will catch any memory related issues, which are discussed further down.

Let's say your custom AV scanner loads its AV database into memory and each scanner process runs independently of the rest. If the database is about 1 GB, that's about 1 GB per scanner, or 10 GB in AV signatures alone when 10 scanners are running at the same time.

Then repeat the test with typical files, and with the largest files your users send, so you get an idea of real world performance. It could for instance be that your scanner loads not only the AV signatures into memory but also the file it's scanning, so if you're sending gigabyte sized files, you need to allocate 10 x that file size in RAM for the scanner to do its thing. On the other hand, it could well be that your scanner only consumes a few megabytes of memory and you don't need to do anything special. Either way, it needs to be tested.
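The arithmetic above can be written down as a rough worst-case estimate. This is a sketch under the assumptions stated in the text: 10 scans in parallel, each scanner process holding its own copy of the signature database, and optionally the whole file being scanned. The function name and parameters are made up for this example, and the result does not replace an actual test.

```ruby
GIB = 1024**3

# Worst-case RAM (in bytes) used by the scanner processes alone, assuming every
# parallel scanner loads its own copy of the signatures and, optionally, the
# whole file it is scanning.
def worst_case_scanner_ram(signature_bytes:, largest_file_bytes:,
                           parallel_scans: 10, loads_file_into_ram: true)
  per_scanner = signature_bytes + (loads_file_into_ram ? largest_file_bytes : 0)
  per_scanner * parallel_scans
end

# Signatures only: 10 scanners x 1 GiB each.
puts worst_case_scanner_ram(signature_bytes: 1 * GIB, largest_file_bytes: 0) / GIB
# prints 10

# Signatures plus a 2 GiB file held in memory by each scanner.
puts worst_case_scanner_ram(signature_bytes: 1 * GIB, largest_file_bytes: 2 * GIB) / GIB
# prints 30
```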

You want to make sure the system isn't swapping during scanning or the performance is going to be really, really poor. If the scanner you're planning to use behaves as described above, you want to give the system at least 32 GB, but more likely 64 GB, of RAM to be sure. This is obviously a LOT of RAM, and a LOT more than the system needs without any custom scanners, so if you can run the scanner with lower memory requirements, that's going to be much more efficient.
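One simple way to keep an eye on this during a test upload is to read the swap counters that Linux exposes in /proc/meminfo. The sketch below is an assumption about how you might check it, not something LiquidFiles provides, and swap_used_kb is a name made up for this example. Swap being in use doesn't by itself prove the system is actively swapping, but a number that grows while the scanners run is a strong hint that they need more RAM.

```ruby
#!/usr/bin/env ruby

# Parses /proc/meminfo style text and returns the amount of swap in use, in kB.
# Returns 0 when the system has no swap configured.
def swap_used_kb(meminfo_text)
  values = meminfo_text.scan(/^(\w+):\s+(\d+)\s*kB/).to_h { |key, kb| [key, kb.to_i] }
  values.fetch("SwapTotal", 0) - values.fetch("SwapFree", 0)
end

# Print the current value; run this before and during a test upload and compare.
if File.readable?("/proc/meminfo")
  puts "Swap in use: #{swap_used_kb(File.read('/proc/meminfo'))} kB"
end
```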

Tests with fast vs slow scanners

If we look at the builtin ClamAV scanner as an example, there are two versions of this scanner: clamscan and clamdscan. Let's look at the difference by scanning /etc/hosts, a very small text file. This test uses the unix time command as described above to see how long it takes on a test system in AWS with 2 GB of RAM and absolutely zero load except the AV scanning:

# time clamscan /etc/hosts
/etc/hosts: OK

----------- SCAN SUMMARY -----------
Known viruses: 8672527
Engine version: 0.103.9
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 0.00:1)
Time: 88.468 sec (1 m 28 s)
Start Date: 2023:09:16 07:58:45
End Date:   2023:09:16 08:00:13

real	1m28.511s
user	0m33.783s
sys	0m2.271s

This took 1m28s. Now, let's try that with clamdscan:

# time clamdscan /etc/hosts
/etc/hosts: OK

----------- SCAN SUMMARY -----------
Infected files: 0
Time: 0.426 sec (0 m 0 s)
Start Date: 2023:09:16 08:01:33
End Date:   2023:09:16 08:01:34

real	0m0.441s
user	0m0.005s
sys	0m0.008s

This took 0.4s. Quite a difference.

To understand the difference between the two we need to understand how they work. If we begin with clamscan, the simpler of the two, it loads the AV signature database into memory every time it runs and then scans the file. The ClamAV signature database is about 1 GB in size. The other one, clamdscan, talks to a server process, clamd, where the AV database is already loaded into memory. When LiquidFiles scans your files for viruses, it uses this clamdscan method.

This means that with clamdscan, we can scan 100 files in about 4s of processing time (100 scans at about 0.4s each, 10 in parallel), plus the time it takes to read each file from the disk. With clamscan though, it will take a long time to complete. In the ideal scenario, with 10 files scanned in parallel, that would be 10 batches x 1m28s (about 15 minutes). In reality it's probably going to be at least double that, because parallel processing is never perfectly efficient, and because you now have 10 scanners each reading a gigabyte's worth of AV signatures from the disk at the same time.
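The ideal-case numbers can be reproduced with a few lines of arithmetic. This is a sketch with a made-up function name: it assumes perfect parallelism across 10 scanners and ignores disk reads and contention, so real world times will be higher.

```ruby
# Ideal-case time to scan file_count files when parallel_scans scans run at once
# and each scan takes seconds_per_scan. Ignores disk and memory contention.
def estimated_scan_seconds(file_count, seconds_per_scan, parallel_scans: 10)
  batches = (file_count.to_f / parallel_scans).ceil
  batches * seconds_per_scan
end

# clamdscan at about 0.4s per scan: 10 batches x 0.4s.
puts estimated_scan_seconds(100, 0.4)
# prints 4.0

# clamscan at about 1m28s (88.5s) per scan, shown in minutes.
puts estimated_scan_seconds(100, 88.5) / 60
# prints 14.75
```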

We also have the memory considerations, so with this setup the system would need a LOT of RAM to function properly. If this was a real world scenario, we would need to test it carefully, because going from about 4s to 15 minutes, even in the absolute ideal scenario, is already going to appear very slow for users.

The reason we're commenting on this is that we've seen this on a customer system that was running the McAfee uvscan AV scanner. Apparently there's no clamd equivalent for uvscan, i.e. the only way to run it is to run it like clamscan above. When we did the test on their system, which had 16 GB of RAM, time uvscan /etc/hosts took over 5 minutes. Someone selecting 100 small files to send would have had to wait hours for the scan to finish under real world load.

Everyone's needs are a bit different, but several hours to scan 100 files is hardly a usable system. With a scanner like uvscan where there's no server process (daemon) that can preload AV signatures, it's always going to be challenging. If you're a bit of a unix wizard, one thing you could potentially do is to create a RAM disk and load the AV signatures onto the RAM disk when the system starts. You could then configure uvscan to load its AV signatures from the copy on the RAM disk, so now you're loading from RAM to RAM, which is going to be MUCH, MUCH faster. But this would also consume quite a bit of RAM. Maybe start with 32 - 64 GB of RAM and adjust from there. You want to make sure you keep swapping to a minimum to get decent performance.

Conclusion

This section highlights the importance of testing your scanner for performance and testing it in real world scenarios so you know what to expect when your users are using the system.

Also please note that any customization, like your own Actionscripts, is outside of the support that we provide, beyond general advice like on this page. If your scanner is consuming gigabytes of RAM and taking a long time to scan, we can't assist in troubleshooting this issue. If you don't have the skills in-house to do this, you would have to engage the vendor you got the scanner from, who could assist in troubleshooting the issue or point you in a direction that improves performance for you.