Welcome to RoutingFilter’s documentation!¶
Usage¶
Sample usage:
from routingfilter.routing import Routing

# Event to route: the "tags" field selects which rules are evaluated
test_event_1 = {
    "tags": "mountain_bike",
    "wheel_model": "Superlight",
    "frame": "aluminium",
    "gears": "1x12",
    "suspension": "full"
}

# Rule: events tagged "mountain_bike" whose wheel_model is one of the listed values
test_rule_1 = {
    "streams": {
        "rules": {
            "mountain_bike": [
                {
                    "filters": [
                        {
                            "type": "EQUALS",
                            "key": "wheel_model",
                            "description": "Carbon fiber wheels need manual truing",
                            "value": ["Superlight", "RacePro"]
                        }
                    ],
                    "streams": {
                        "Workshop": {
                            "workers_needed": 1
                        }
                    }
                }
            ]
        }
    }
}

routing = Routing()
routing.load_from_dicts([test_rule_1])
routing.match(test_event_1)
The rule’s top-level name (default streams) can be changed; in that case, pass the new name to the match method through its type_ argument.
At the moment, the second level (rules) is hard-coded and will likely be removed in the future, since it carries no semantic meaning.
The third level matches a field in the event (default tags); a different field can be selected through the tag_field_name argument of the match method. Once a rule under a given tag (e.g. “mountain_bike”) matches the event, the remaining rules under that tag are ignored. Rules under different tags can match independently (for example, to send an event to different pipelines based on its tag).
The “streams” element after filters means that, if the filter matches, the event will be enriched with the Workshop dictionary.
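The "first matching rule per tag" behaviour described above can be illustrated with a small standalone sketch. It does not use RoutingFilter itself: the helpers `equals` and `first_match_per_tag` are hypothetical, only an EQUALS-like check is modelled, and two points are assumptions not stated in the text (all filters of a rule must pass, and the tags field may hold either a string or a list of strings).

```python
from typing import Any, Dict, List


def equals(event: Dict[str, Any], key: str, values: List[str]) -> bool:
    """Toy EQUALS check: True if event[key] is one of the listed values."""
    return key in event and event[key] in values


def first_match_per_tag(event: Dict[str, Any],
                        rules_by_tag: Dict[str, list]) -> List[dict]:
    """Return at most one result per tag: the first rule that matches."""
    tags = event.get("tags", [])
    if isinstance(tags, str):  # assumption: a single tag or a list of tags
        tags = [tags]
    results = []
    for tag in tags:
        for rule in rules_by_tag.get(tag, []):
            # Assumption: every filter of the rule has to pass.
            if all(equals(event, f["key"], f["value"]) for f in rule["filters"]):
                results.append({"rules": rule["filters"], "output": rule["streams"]})
                break  # the remaining rules under this tag are ignored
    return results


event = {"tags": "mountain_bike", "wheel_model": "Superlight"}
rules_by_tag = {
    "mountain_bike": [
        {"filters": [{"key": "wheel_model", "value": ["Superlight", "RacePro"]}],
         "streams": {"Workshop": {"workers_needed": 1}}},
        # This rule would also match, but it is never reached.
        {"filters": [{"key": "wheel_model", "value": ["Superlight"]}],
         "streams": {"Warehouse": {"workers_needed": 2}}},
    ]
}
results = first_match_per_tag(event, rules_by_tag)
print(results)  # a single entry, carrying the Workshop output
```

Both rules would accept the event, yet only the first one is reported, which mirrors the ordering rule described in the text.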
Available filters¶
For filter types that use the “key” and “value” fields, each field can be either a string or a list of strings. The chosen logic is OR (at least one match must be satisfied).
- ALL - matches everything, always returns True (if this matches, all other rules are ignored)
- EXISTS - returns True if the key in the “key” field exists
- NOT_EXISTS - returns True if the key in the “key” field does not exist
- EQUALS - returns True if the value in the specified “key” is equal to “value”
- STARTSWITH - returns True if the value of “key” starts with “value”
- ENDSWITH - returns True if the value of “key” ends with “value”
- KEYWORD - returns True if “value” is present in the value of “key” (as an item of a list or as a substring of a string)
- REGEXP - returns True if the value of “key” matches the RegExp specified in “value”
- NETWORK - parses the field into an IP address or network and returns True if the IP address is contained in the specified network
- NOT_NETWORK - parses the field into an IP address or network and returns True if the IP address is NOT contained in the specified network
- DOMAIN - similar to EQUALS, but also tries to parse subdomains (separated by “.”)
- GREATER - returns True if the value in the specified “key” is greater than “value”
- LESS - returns True if the value in the specified “key” is less than “value”
- GREATER_EQ - returns True if the value in the specified “key” is greater than or equal to “value”
- LESS_EQ - returns True if the value in the specified “key” is less than or equal to “value”
- TYPE_OF - returns True if the target type is the same as the type of the value. The types checked are: str, int, bool, list, dict, IP address or MAC address
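As an illustration of the OR logic over “key” and “value”, here is a minimal standalone sketch of a STARTSWITH-like check. The function `startswith_filter` and the helper `as_list` are hypothetical names used only for this example and are not part of RoutingFilter; details such as case handling in the real filters are not covered.

```python
from typing import Any, Dict, List, Union

StrOrList = Union[str, List[str]]


def as_list(item: StrOrList) -> List[str]:
    """Wrap a single string in a list so both forms are handled alike."""
    return [item] if isinstance(item, str) else list(item)


def startswith_filter(event: Dict[str, Any], key: StrOrList, value: StrOrList) -> bool:
    """True if ANY listed key holds a string starting with ANY listed value."""
    for k in as_list(key):
        field = event.get(k)
        if not isinstance(field, str):
            continue  # missing or non-string fields cannot match
        if any(field.startswith(v) for v in as_list(value)):
            return True
    return False


event = {"wheel_model": "Superlight", "frame": "aluminium"}
print(startswith_filter(event, ["frame", "wheel_model"], ["Race", "Super"]))  # True
print(startswith_filter(event, "frame", "carbon"))  # False
```

The first call succeeds because one key/value pair out of the four combinations matches ("wheel_model" starts with "Super").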
The values of the NETWORK and NOT_NETWORK filters must be strings containing a valid IP or network address (in CIDR notation); otherwise a ValueError is raised. The GREATER, LESS, GREATER_EQ and LESS_EQ filters require float (or float-parsable) values; otherwise a ValueError is raised.
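The validation behaviour can be reproduced with the Python standard library alone. The sketch below is not RoutingFilter's implementation; `network_filter` and `greater_filter` are hypothetical helpers showing how `ipaddress` and `float()` naturally raise ValueError on malformed input.

```python
import ipaddress


def network_filter(field_value: str, network: str) -> bool:
    """True if the address (or network) in field_value lies inside `network`."""
    # ip_network raises ValueError when either string is not a valid address/CIDR.
    net = ipaddress.ip_network(network, strict=False)
    candidate = ipaddress.ip_network(field_value, strict=False)
    # Compare versions first: subnet_of rejects mixed IPv4/IPv6 operands.
    return candidate.version == net.version and candidate.subnet_of(net)


def greater_filter(field_value, threshold) -> bool:
    """True if field_value > threshold after converting both to float."""
    # float() raises ValueError for strings that are not numbers.
    return float(field_value) > float(threshold)


print(network_filter("192.168.1.10", "192.168.1.0/24"))  # True
print(network_filter("192.168.2.10", "192.168.1.0/24"))  # False
print(greater_filter("3.5", 2))  # True

try:
    network_filter("192.168.1.10", "not-a-network")
except ValueError as exc:
    print(f"invalid network: {exc}")
```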
Example¶
Let’s see a routing application with firewall rules. We have network traffic events in the following format:
test_event_n1 = {
    "tags": "ip_traffic",
    "src_addr": "192.168.1.10",
    "dst_addr": "192.168.1.15",
    "domain": "test.domain.local"
}
test_event_n2 = {
    "tags": "ip_traffic",
    "src_addr": "192.168.2.10",
    "dst_addr": "192.168.2.15",
    "domain": "test.otherdomain.local"
}
We want to filter all traffic tagged as ip_traffic, coming from the subnet 192.168.1.0/24 and enrich all the other events with a new field processed. We create the following rule:
test_rule_n1 = {
    "streams": {
        "rules": {
            "ip_traffic": [
                {
                    "filters": [
                        {
                            "type": "NETWORK",
                            "key": "src_addr",
                            "value": "192.168.1.0/24"
                        }
                    ],
                    "streams": {
                        "filtered": False
                    }
                }
            ]
        }
    }
}
If we apply this rule to test_event_n1, it will match, since the src_addr is in the subnet 192.168.1.0/24. We obtain the following output:
{
    "rules": [
        {
            "type": "NETWORK",
            "key": "src_addr",
            "value": "192.168.1.0/24"
        }
    ],
    "output": {
        "filtered": False
    }
}
The second event, test_event_n2, will not match the rule test_rule_n1, since its src_addr is not in the subnet 192.168.1.0/24. The function will return an empty list.
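Since match is documented as returning a list of {"rules": […], "output": {…}} entries, a caller might apply the outputs to the event as sketched below. The `enrich` helper is hypothetical and not part of RoutingFilter, and the result lists are written out by hand from the example above instead of being produced by the library.

```python
from typing import Any, Dict, List


def enrich(event: Dict[str, Any], results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the event with every matched rule's output merged in."""
    enriched = dict(event)  # shallow copy, the original event is left untouched
    for result in results:
        enriched.update(result["output"])
    return enriched


event_n1 = {"tags": "ip_traffic", "src_addr": "192.168.1.10"}
event_n2 = {"tags": "ip_traffic", "src_addr": "192.168.2.10"}

# Hand-written stand-ins for the values match() is described as returning.
results_n1 = [{
    "rules": [{"type": "NETWORK", "key": "src_addr", "value": "192.168.1.0/24"}],
    "output": {"filtered": False},
}]
results_n2 = []  # nothing matched

print(enrich(event_n1, results_n1))
print(enrich(event_n2, results_n2))
```

With an empty result list the event comes back unchanged, so callers do not need a special case for "no match".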
Routing¶
class routingfilter.routing.Routing
Bases: object
get_rules() → Optional[dict]
Return the currently loaded rules. It is mainly used for debugging purposes.
Returns: A dict or None
Return type: Optional[dict]
load_from_dicts(rules_list: List[dict], validate_rules: bool = True, variables: Optional[dict] = None) → None
Load routing configuration from a list of dictionaries. It merges the different rules in the list into a single routing rule. It optionally validates the rules before accepting them (an exception is raised in case of errors).
Parameters:
- rules_list (List[dict]) – The configuration
- validate_rules (bool) – Perform rules validation (default=True). It can be disabled to improve performance (unsafe)
- variables (Optional[dict]) – Variables dictionary to replace rule values
Return type: None
load_from_jsons(rules_list: List[str], validate_rules: bool = True, variables: Optional[dict] = None) → None
Load routing configuration from JSON data. It merges the different rules in the list into a single routing rule. It optionally validates the rules before accepting them (an exception is raised in case of errors).
Parameters:
- rules_list (List[str]) – The JSON data; each item is parsed into a dict
- validate_rules (bool) – Perform rules validation (default=True). It can be disabled to improve performance (unsafe)
- variables (Optional[dict]) – Variables dictionary to replace rule values
Return type: None
match(event: dict, type_: str = 'streams', tag_field_name: str = 'tags') → List[dict]
Process a single event message through the routing filters and verify whether it matches (at least) one filter. For each top-level tag in the rule, only the first matching filter is returned. Multiple dictionaries can only be returned when rules under different tags match.
Parameters:
- event (dict) – The entire event to process
- type_ (str) – The event type (can be ‘streams’, ‘customer’ or anything else, as defined in the routing config). If the type does not exist, an empty list is returned
- tag_field_name (str) – The event field to search into (default=’tags’)
Returns: A list of dicts containing the matched rules and the outputs in the following format: {“rules”: […], “output”: {…}}; an empty list if no rule matched
Return type: List[dict]
Keywords¶
Keywords can be used in the routing rules as variables inside filter values. Keywords are strings that start with $, for example: $BICYCLE_COLORS. Keyword definitions must be placed in a .json file inside a dictionaries directory, for example:
{
    "BICYCLE_COLORS": ["red", "blue"]
}
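To make the substitution concrete, the following sketch expands $-prefixed entries of a filter value using a dictionary parsed from JSON such as the one above. It is only an illustration of the idea: `resolve_keywords` is a hypothetical helper, the "color" key is invented for the example, and how RoutingFilter resolves keywords internally is not described here.

```python
import json
from typing import Any, Dict, List, Union


def resolve_keywords(value: Union[str, List[str]], variables: Dict[str, Any]) -> List[Any]:
    """Expand every "$NAME" entry using `variables`; keep other entries as-is."""
    resolved: List[Any] = []
    for item in [value] if isinstance(value, str) else value:
        if isinstance(item, str) and item.startswith("$"):
            replacement = variables[item[1:]]  # KeyError if the keyword is undefined
            if isinstance(replacement, list):
                resolved.extend(replacement)
            else:
                resolved.append(replacement)
        else:
            resolved.append(item)
    return resolved


variables = json.loads('{"BICYCLE_COLORS": ["red", "blue"]}')
# Hypothetical filter: "color" is an invented event field.
filter_ = {"type": "EQUALS", "key": "color", "value": "$BICYCLE_COLORS"}
print(resolve_keywords(filter_["value"], variables))  # ['red', 'blue']
```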
Benchmark tests¶
The benchmark tests were developed in order to analyze the routing execution time. The tests are available for the EQUALS, STARTSWITH, ENDSWITH, KEYWORD, REGEXP, NETWORK, DOMAIN and GREATER filters. In particular, the tests are:
- test1_<FILTER>_no_key_match: it sends 100 messages with 50 fields, but without the wheel_model key
- test2_<FILTER>_key_exists: it sends 100 messages with 50 fields and one of the keys is wheel_model, but with a value different from Superlight
- test3_<FILTER>_list_values: it sends 100 messages with 50 fields and one of the keys is wheel_model. In addition, the rule contains 100 values
- test4_<FILTER>_values_message: both the wheel_model key in the event and the rule include a list of 100 values
In order to launch the benchmark tests, run
python routing_benchmark.py
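For readers who want to reproduce a similar measurement by hand, here is a rough sketch of the first scenario (100 messages with 50 fields and no wheel_model key). It does not reflect the contents of routing_benchmark.py: `make_messages` and `time_matching` are hypothetical helpers, the field names are invented, and a trivial lambda stands in for the call to the routing match method.

```python
import time
from typing import Callable, Dict, List


def make_messages(count: int = 100, fields: int = 50) -> List[Dict[str, str]]:
    """Build `count` events with `fields` filler keys and no wheel_model key."""
    return [
        {f"field_{i}": f"value_{n}_{i}" for i in range(fields)}
        for n in range(count)
    ]


def time_matching(match: Callable[[dict], object], messages: List[dict]) -> float:
    """Return the seconds spent calling `match` once per message."""
    start = time.perf_counter()
    for message in messages:
        match(message)
    return time.perf_counter() - start


messages = make_messages()
# Stand-in for the real matcher; replace the lambda with the routing call to time it.
elapsed = time_matching(lambda event: "wheel_model" in event, messages)
print(f"{len(messages)} messages in {elapsed:.6f}s")
```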