Mirror of https://github.com/roostorg/osprey github.com/roostorg/osprey

# Osprey Docs

![images/rules_architecture.png](images/rules_architecture.png)

## Rules

### Creating Rules

Rules in Osprey are written in `Some Madeup Language (SML)` and follow most syntax conventions present in the Osprey Query UI. SML is a subset of Python with additional restrictions to make the rules simpler to craft.

Rules by themselves only create variables. Without a corresponding `WhenRules()` function call, a rule has no effect outside of evaluation and query functionality.

Rules currently support the following concepts through the `Rule()` function.

- Name

  `Rule_Name = Rule()`

  The name of the rule also functions as a conventional “RuleId” and as the name of the bool that can be used to query individual rule hits in the Osprey Query UI. As a result, changing the name of a rule after activation may affect historical query results in the UI if not logged externally.

- Logic

  `when_all=[]`

  The logic used to evaluate an Osprey rule is expressed as a single comma-delimited list of signals within the `when_all` parameter of the `Rule()` function. It supports the use of Labels, Plugins, UDFs, and other values to help enrich heuristics.

  At present, when evaluating UDFs or abstracted variables, any `NULL` evaluation in the list causes the entire rule to evaluate as `NULL`, which may be undesirable.

- Description

  `description=f''`

  An additional string description field can be emitted alongside the rule to external systems, such as logging and ticketing systems. It gives work-streams plain-language context on what the rule's criteria are and what the rule is intended to do.

  It may also be helpful to include dynamic variables, so that operational workflows can identify the specific values related to the trigger criteria.

An example of a simple rule using various signal evaluations and out-of-the-box UDFs is shown below.

```python
My_Rule_Name_v2 = Rule(
    when_all=[
        # Primary Signal
        MyFirstValue == True,
        HasLabel(entity=MyEntityName, label='MyLabel'),
        ListLength(list=UsersValues) == 5,
        # Secondary Signal
        RegexMatch(target=MyStringValue, pattern='(hello|world)'),
        MySecondValue >= 3,
        MyThirdValue != Null,
        # Guardrail Signal
        (_LocalValue in [1, 2, 3, 5]) or (GlobalValue in ['hello', 'howdy']),
        not HasLabel(entity=MySecondEntityName, label='MySecondLabel'),
    ],
    description=f"{UserA} performed {ActionB} in this way. Emit warning",
)
```

### Instrumenting Rules with WhenRules

The `WhenRules()` function connects rules to external services, declarations, or internal label modifications. Rule objects are listed in sequence within the `rules_any=[]` parameter, and the `EffectBase` outputs to apply are listed within the `then=[]` parameter. By default, operators and designers can use UDFs with predefined effects, such as `DeclareVerdict()`, `LabelAdd()`, and `LabelRemove()`, on positive rule evaluations.

Below is an example of a `WhenRules()` block used to verify an email and reject a request.

```python
WhenRules(
    rules_any=[
        Enabled_Rule_1,
        Enabled_Rule_2,
        # Staged_Rule_1,
    ],
    then=[
        # Verdicts
        DeclareVerdict(verdict='reject'),
        # Labels
        LabelAdd(entity=UserId, label='recently_challenged', expires_after=TimeDelta(days=7)),
        LabelAdd(entity=UserId, label='verify', apply_if=NotVerified),
        LabelAdd(entity=Email, label='pending_verify'),
        LabelAdd(entity=Domain, label='recently_seen', expires_after=TimeDelta(days=7)),
    ],
)
```

`WhenRules()` must occur after the rules it references are created within the file. Rule outcomes can become difficult to interpret if `WhenRules()` blocks are too widely distributed, so it can be beneficial to place effects toward the bottom of workflows.
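
As a rough mental model of how `when_all` and `rules_any` fit together, the plain-Python sketch below may help. It is not Osprey's implementation: `evaluate_when_all` and `fire_when_rules` are hypothetical helpers, `None` stands in for `NULL`, and effects are represented as plain strings. It assumes that effects are emitted only when at least one listed rule evaluates to true.

```python
from typing import List, Optional, Sequence


# Hypothetical stand-ins for illustration only; these are not Osprey functions.
def evaluate_when_all(signals: Sequence[Optional[bool]]) -> Optional[bool]:
    """Return None if any signal is None, otherwise the AND of all signals."""
    if any(signal is None for signal in signals):
        return None
    return all(signals)


def fire_when_rules(rules_any: Sequence[Optional[bool]], then: Sequence[str]) -> List[str]:
    """Return every effect in `then` if any rule is True, otherwise no effects."""
    if any(result is True for result in rules_any):
        return list(then)
    return []


enabled_rule_1 = evaluate_when_all([True, True])  # every signal holds
enabled_rule_2 = evaluate_when_all([True, None])  # one null signal nulls the rule
effects = fire_when_rules(
    rules_any=[enabled_rule_1, enabled_rule_2],
    then=["DeclareVerdict(reject)", "LabelAdd(recently_challenged)"],
)
print(effects)
```

In this model a null rule never triggers effects on its own, but it also does not block other rules in the same `rules_any` list from doing so.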

## Output Sinks

After all of the rules have run, a set of output sinks takes the resulting `ExecutionOutput` and performs additional work based on that data. Output sinks may be defined as part of a plugin as a means to perform domain-specific work.

Some default output sinks include `StdoutOutputSink`, which simply outputs the result to the log; `KafkaOutputSink`, which pipes data to Kafka (used for the Osprey UI); and `LabelsSink`, which can add stateful data to be used in future rule executions.

```python
from typing import Optional

# BaseOutputSink, DynamicLogSampler, and ExecutionResult are provided by Osprey.
class StdoutOutputSink(BaseOutputSink):
    """An output sink that prints to standard out!"""

    def __init__(self, log_sampler: Optional[DynamicLogSampler] = None):
        pass

    def will_do_work(self, result: ExecutionResult) -> bool:
        return True

    def push(self, result: ExecutionResult) -> None:
        print(f'result: {result.extracted_features_json} {result.verdicts}')

    def stop(self) -> None:
        pass
```

Passing data to these output sinks is standardized through the use of `Effects`, which are the outputs of some functions, usually UDFs.

```python
def push(self, result: ExecutionResult) -> None:
    users_to_ban = result.effects[BanUserEffect]
    ban_users(users_to_ban)
```

## User Defined Functions (UDFs)

User Defined Functions (UDFs) are plugins that enable users of Osprey to extend and customize their use of Osprey SML. UDFs are Python functions that are defined and registered as a plugin. They extend the `UDFBase` abstract base class with a set of arguments and an output, and they are executed whenever they are called in SML.

```python
# example_plugins/text_contains.py
import re

class TextContainsArguments(ArgumentsBase):
    text: str
    phrase: str
    case_sensitive: bool = False

class TextContains(UDFBase[TextContainsArguments, bool]):
    def execute(self, execution_context: ExecutionContext, arguments: TextContainsArguments) -> bool:
        # Match the phrase as a whole word, escaping any regex metacharacters.
        escaped = re.escape(arguments.phrase)
        pattern = rf'\b{escaped}\b'
        flags = 0 if arguments.case_sensitive else re.IGNORECASE
        regex = re.compile(pattern, flags)
        return bool(regex.search(arguments.text))

# example_plugins/register_plugins.py
@hookimpl_osprey
def register_udfs():
    return [TextContains]
```

Usage in SML:

```python
# example_rules/post_contains_hello.sml
ContainsHello = Rule(
    when_all=[
        EventType == 'create_post',
        TextContains(text=PostText, phrase='hello'),
    ],
    description='Post contains the word "hello"',
)
```

### Effects

Plugins may also define external effects, which are useful for performing functionality in your primary service. Effects are simply passed to output sinks at the end of a rule run. UDFs that produce effects have an output that extends `EffectBase` and can be called as a result of a `WhenRules`.

```python
# example_plugins/src/ban_user.py
class BanUser(UDFBase[BanUserArguments, BanUserEffect]):
    category = UdfCategories.ENGINE

    def execute(self, execution_context: ExecutionContext, arguments: BanUserArguments) -> BanUserEffect:
        return BanUserEffect(
            entity=arguments.entity,
            comment=arguments.comment,
        )

# example_rules/post_contains_hello.sml
WhenRules(
    rules_any=[ContainsHello],
    then=[BanUser(entity=UserId, comment='User said "hello"')],
)
```

UDF outputs can also implement the `CustomExtractedFeature` interface, in which case they are persisted in the outputs for the UI. `EffectToCustomExtractedFeatureBase` can also be used when effects need additional processing for use in the UI.

## Labels

**NOTE: Labels are currently not in v0, so users will be unable to add or edit labels via the UI**

Labels are a standard plugin that enables stateful rules and touches many parts of Osprey. They are effectively tags on various entities, which can be arbitrarily defined.

### Creating Entities

Labels are applied to Entities, which are dynamically interpreted from the outputs of the `EntityJson` UDF. Entities are usually created from pieces of data that are generally consistent across actions, such as a user ID or email.

```python
# user.sml
UserId: Entity[str] = EntityJson(
    type='User',
    path='$.user_id'
)
```

It is possible to create new UDFs that also create entities by setting the output of the UDF to `EntityT`.

### Adding Labels

Labels can be added in a `WhenRules` clause. This causes the labels output sink to tag the given entity with the given label at the end of the rules run.

```python
WhenRules(
    rules_any=[
        Sent_Too_Many_DMs,
    ],
    then=[
        LabelAdd(entity=UserId, label='likely_spammer')
    ],
)
```

### Using Labels

Since Labels can be retrieved during a rule run, they can effectively be used as state for your rules.

```python
Should_Warn_User_Of_Spammer = Rule(
    when_all=[
        HasLabel(entity=UserId, label='likely_spammer'),
        This_Is_A_New_DM,
    ],
)
```

Labels are also shown in the UI for entities and can also be set manually. Note that since the UI only searches across actions, `HasLabel` will not work in the Query UI. Instead, you may use `DidAddLabel`, which is true when the given action added a label to a specific entity.

```python
# UI Query
DidAddLabel(entity_type="UserId", label_name="likely_spammer")
```

## Notable Gotchas

### Nulls

A null is the case where a rule or variable in SML does not exist. This can occur for many reasons, for example when a piece of data is missing or a rule didn’t run. Unlike in many programming languages, a rule that references a null-valued variable is generally not evaluated (and thus downstream rules that depend on it are not evaluated either). The exception is when nulls are explicitly checked in a rule. For example:

```python
Thing: int = JsonData(path='$.property_that_doesnt_exist')

# Evaluates to False
MyFirstRule = Rule(when_all=[
    Thing != Null,
])

# Skips evaluation and sets to Null
MySecondRule = Rule(when_all=[
    Thing > 1,
])

# Skips evaluation and sets to Null
MyThirdRule = Rule(when_all=[
    MySecondRule,
])
```

## Workflow Structure and File Placement

SML files can be composed to make your rules easier to understand. The `Import` statement allows you to include rules and variables defined in other files.

```python
# models/action_name.sml
ActionName = "foo"

# main.sml
Import(
    rules=[
        'models/action_name.sml',
        'models/http_request.sml',
    ]
)

MyRule = Rule(when_all=[ActionName == "foo"])
```

`Require` allows you to selectively run other SML scripts. `Require` supports templating and conditionals, allowing scripts to be filtered out if necessary. This is important in situations where some rules or UDFs are particularly expensive to run (such as those that call an AI service).

```python
# main.sml
Require(rule=f'actions/{ActionName}.sml')  # will execute 'actions/foo.sml'

Require(rule='ai_services/my_ai_service.sml', require_if=ActionName == "register")
```
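
The selection behavior of `Require` described above can be modeled in plain Python. This is only an illustrative sketch, not how Osprey resolves scripts: `resolve_requires` is a hypothetical helper, each `Require` is represented as a `(path, require_if)` tuple, and the sketch assumes a script runs when its `require_if` condition is either absent (`None` here) or true.

```python
from typing import List, Optional, Tuple


# Hypothetical model of Require(); the name and behavior are assumptions.
def resolve_requires(requires: List[Tuple[str, Optional[bool]]]) -> List[str]:
    """Return the script paths that would run, in order."""
    selected = []
    for path, require_if in requires:
        # None means no require_if was given, so the script always runs.
        if require_if is None or require_if is True:
            selected.append(path)
    return selected


action_name = "foo"
scripts = resolve_requires([
    (f"actions/{action_name}.sml", None),  # templated path
    ("ai_services/my_ai_service.sml", action_name == "register"),  # conditional
])
print(scripts)  # ['actions/foo.sml']
```

With `action_name` set to `"foo"`, the expensive AI service script is filtered out, which mirrors the cost-saving use case described above.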