How Are Flow Records Stored and Processed by GigaFlow?
Database Structure
DATA IN: Consists of NetFlow and SNMP Collection
STEP 1: Port Receives Flow
Receivers can be viewed by selecting Settings and then selecting Receivers.
As flow records come in, they first encounter the Receiver process.
STEP 2: Processing
As a NetFlow packet comes in, the sending IP address is pulled out and, if it is not already registered, a new device is registered. Devices are generated automatically and the NetFlow is associated with the IP address.
The IP addresses are stored as numbers (IDs) to improve performance and save disk space. Each ID can be cross-referenced with the relevant IP address.
Overall, the flow record from each device is taken and stored.
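The registration step can be pictured with a small sketch. This is not GigaFlow's implementation; the class name DeviceRegistry, its methods, and the sequential ID scheme are assumptions made purely for illustration, and the example handles IPv4 only.

```python
import ipaddress


class DeviceRegistry:
    """Hypothetical registry mapping a sender's IP address to a numeric device ID."""

    def __init__(self):
        self._id_by_ip = {}  # IP address stored as an integer -> device ID
        self._ip_by_id = {}  # device ID -> IP address stored as an integer

    def register(self, sender_ip: str) -> int:
        """Return the device ID for sender_ip, registering the device on first sight."""
        ip_num = int(ipaddress.IPv4Address(sender_ip))  # store the IP as a number
        if ip_num not in self._id_by_ip:
            device_id = len(self._id_by_ip) + 1  # assumed: simple sequential IDs
            self._id_by_ip[ip_num] = device_id
            self._ip_by_id[device_id] = ip_num
        return self._id_by_ip[ip_num]

    def lookup_ip(self, device_id: int) -> str:
        """Cross-reference a device ID back to its dotted IP address."""
        return str(ipaddress.IPv4Address(self._ip_by_id[device_id]))


registry = DeviceRegistry()
first = registry.register("10.0.0.1")   # new device is registered
again = registry.register("10.0.0.1")   # already registered, same ID returned
other = registry.register("10.0.0.2")
```

Storing the address as an integer keeps the lookup tables compact, which is the same motivation the step above gives for using IDs.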
STEP 3: FPR (First Packet Response) Monitoring
Shows exactly what happens to a device's flow record as it passes through the system.
STEP 4: Stitching
While NetFlow records come in, ARP and CAM collections are also taking place (MAC addresses, user IDs, and user-to-IP mappings from a separate source).
All data received is “Stitched” together to make one large flow record containing: Duration of session, Source IP, Destination IP, App ID, MAC, Domain, User, Domain, etc.
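A minimal sketch of the stitching idea follows, assuming the ARP/CAM and user data are available as simple IP-keyed lookup tables. The StitchedFlow type, the stitch function, and the dictionary keys are hypothetical names chosen for this example and do not reflect GigaFlow's internal schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StitchedFlow:
    """Hypothetical enriched record combining flow data with ARP/CAM and user data."""
    src_ip: str
    dst_ip: str
    duration_ms: int
    app_id: int
    src_mac: Optional[str] = None
    user: Optional[str] = None


def stitch(flow: dict, ip_to_mac: dict, ip_to_user: dict) -> StitchedFlow:
    """Join one raw flow record with MAC and user lookups keyed by source IP."""
    src = flow["src_ip"]
    return StitchedFlow(
        src_ip=src,
        dst_ip=flow["dst_ip"],
        duration_ms=flow["end_ms"] - flow["start_ms"],  # duration of session
        app_id=flow["app_id"],
        src_mac=ip_to_mac.get(src),   # None if ARP/CAM has not seen this IP
        user=ip_to_user.get(src),     # None if no user-to-IP mapping exists
    )


raw = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9",
       "start_ms": 1000, "end_ms": 4500, "app_id": 80}
record = stitch(raw,
                ip_to_mac={"10.0.0.5": "aa:bb:cc:dd:ee:ff"},
                ip_to_user={"10.0.0.5": "jsmith"})
```

Missing lookups are left as None rather than raising, on the assumption that a flow may arrive before the matching ARP or user data has been collected.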
STEP 5: Flow Record Checkpoints
Broken down into 3 checkpoints:
1: Black/White List Evaluation
2: Syn Monitoring
3: Profiling
Black/White List Evaluation determines whether there is a warning associated with the record.
Only after a flow record passes through all checkpoints can it be stored. Processes 1-3 still take place regardless of whether the data is stored or not. After these processes take place, the record moves on to the next step.
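The ordering described above (all checkpoints always run, storage is decided afterwards) might be sketched as follows. The function names and the rules inside each check are placeholders invented for this example; the actual SYN monitoring and profiling logic is not described in this document.

```python
from typing import Callable, List, Optional

# A checkpoint takes a flow record and returns a warning string, or None.
Checkpoint = Callable[[dict], Optional[str]]


def make_list_check(blacklist: set) -> Checkpoint:
    """Checkpoint 1 (assumed rule): warn if either endpoint is blacklisted."""
    def check(flow: dict) -> Optional[str]:
        if flow["src_ip"] in blacklist or flow["dst_ip"] in blacklist:
            return "blacklisted address"
        return None
    return check


def syn_check(flow: dict) -> Optional[str]:
    """Checkpoint 2 (placeholder rule): warn on flows flagged as SYN-only."""
    return "syn-only flow" if flow.get("syn_only") else None


def profile_check(flow: dict) -> Optional[str]:
    """Checkpoint 3 (placeholder): profiling rules would go here."""
    return None


def process(flow: dict, checkpoints: List[Checkpoint],
            store_flows: bool, storage: list) -> List[str]:
    """Run every checkpoint, then store the flow only if storage is enabled."""
    warnings = []
    for checkpoint in checkpoints:  # checkpoints run whether or not we store
        warning = checkpoint(flow)
        if warning is not None:
            warnings.append(warning)
    if store_flows:
        storage.append(flow)
    return warnings


storage: list = []
checks = [make_list_check({"203.0.113.7"}), syn_check, profile_check]
flow = {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.7", "syn_only": True}
not_stored = process(flow, checks, store_flows=False, storage=storage)
stored = process(flow, checks, store_flows=True, storage=storage)
```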
STEP 6: Storing Raw Forensics
If data is NOT going to be stored, the only process that takes place is the updating of the Interface Summaries. Flowsec and flow storage are not performed.
If data IS going to be stored, Flowsec is taken first, deduplicated, and put into the flow database. (Whether flows are stored can be viewed by going into Settings and selecting Infrastructure Devices, then selecting Interface Settings for each device and viewing the Store flows setting.)
Unlike devices, Flowsec is identified only by a timestamp. Each Flowsec period represents 1 hour, so a new table is created every hour. This is a new table rather than a new file, as multiple files may be made and used to make up a single table.
For every Flowsec time period, an index is created from the source and destination IP addresses. This index allows quick searching to see whether a certain IP address was ever on the network.
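To illustrate the hourly periods and the source/destination index, here is a small in-memory sketch. The hour_bucket naming format and the HourlyIpIndex class are assumptions for illustration only; GigaFlow's real table names and index structures are not given in this document.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hour_bucket(ts: float) -> str:
    """Map a Unix timestamp to an hourly period label (assumed format YYYYMMDDHH, UTC)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y%m%d%H")


class HourlyIpIndex:
    """Hypothetical per-hour index of every source and destination IP seen."""

    def __init__(self):
        self._ips_by_hour = defaultdict(set)

    def add(self, ts: float, src_ip: str, dst_ip: str) -> None:
        """Record both endpoints of a flow under its hourly period."""
        self._ips_by_hour[hour_bucket(ts)].update((src_ip, dst_ip))

    def hours_seen(self, ip: str) -> list:
        """Return the hourly periods in which ip appeared as source or destination."""
        return sorted(h for h, ips in self._ips_by_hour.items() if ip in ips)


index = HourlyIpIndex()
index.add(0, "10.0.0.1", "10.0.0.2")       # first hour of the epoch
index.add(3600, "10.0.0.1", "192.0.2.8")   # one hour later
```

Answering "was this IP ever on the network?" then only requires checking the small per-hour sets, not scanning the flow records themselves.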
To find out how much disk space is being used, select Performance at the top right of the page, then select All to view a graph of all disk space usage.
STEP 7: Anuview Storage Partitioning
This is a chart of disk space which shows how the data received is separated. It tracks the following database types:
Flowsec
Srcadd and dstadd: the indexes created for Flowsec. These can be created due to the deduplication process.
Forensics: indexes cannot be created here, as there is a multiplication factor regarding the devices. Too much disk space would be required to keep track of all devices, as 10 devices may have the same IP address, etc.
The chart helps to reduce query time, as the size of each square is proportional to how much disk space is being used.
Overview of Partitioning
Partitioning involves storing data in the database storage according to 2 factors: Time and Device ID.
Dividing each device's NetFlow into different time periods reduces the number of files the database has to manage.
Whether data is stored or not, the Interface Summaries are updated anyway.
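Partitioning by the two factors above can be sketched as a composite key of device ID and time period. The function names, the record fields, and the one-hour default period are assumptions for this example.

```python
def partition_key(device_id: int, ts: float, period_s: int = 3600) -> tuple:
    """Return (device ID, start of the time period containing ts)."""
    return (device_id, int(ts // period_s) * period_s)


def group_by_partition(flows: list, period_s: int = 3600) -> dict:
    """Group flow records so each (device, period) pair gets its own partition."""
    partitions: dict = {}
    for flow in flows:
        key = partition_key(flow["device_id"], flow["ts"], period_s)
        partitions.setdefault(key, []).append(flow)
    return partitions


flows = [
    {"device_id": 1, "ts": 10},
    {"device_id": 1, "ts": 20},     # same device, same hour: same partition
    {"device_id": 1, "ts": 3700},   # same device, next hour: new partition
    {"device_id": 2, "ts": 10},     # different device: new partition
]
partitions = group_by_partition(flows)
```

A query scoped to one device and time range then only needs to touch the matching partitions.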
SNMP Collection
Data is collected every 30 minutes and is used in the Stitching process.
If the interval is set to longer than 30 minutes, people may plug devices in and go undetected by data collection.