Real-Time Decisioning Feature Platform

Use the Real-Time Decisioning Feature Platform to create and manage features. Features are data attributes or properties that describe individual characteristics of a dataset. You can use features to perform analytical tasks (such as using machine learning models for prediction), or perform operational tasks (such as using rules to take actions) related to fraud detection. Real-Time Decisioning’s Feature Platform lets you manage your data more efficiently. It allows your fraud analysts and data scientists to focus on the tasks of developing detection and prevention strategies rather than spending too much time generating features. Feature Platform offers:

Out-of-the-box features for risk and fraud use cases.
A platform for analysts and data scientists to build, test, and deploy both simple and complex features in production either by using an intuitive UI or by programming code snippets.
User interfaces that allow you to easily generate feature templates and add them to feature packages so that other users can share and reuse those packages.
The ability to generate features in batches or in real time, in a low-latency, high queries-per-second production environment.
The ability to move features between different production environments for easy maintenance.

Creating features

Using Feature Platform, you can create new features from your dataset in two ways:

Create features by user interface (UI) or code snippets from scratch
Create features by importing features from another production environment

Feature Platform distinguishes between two types of features:

Aggregation features: Features that contain information related to a specific time span
Non-aggregation features: Features that do not contain time-related information

You can create features in the Feature Platform directly in the UI, or by writing code. By offering these two options, Real-Time Decisioning gives you greater control over your feature creation process. To create a feature, follow this procedure:

Navigate to the Feature Platform Dashboard.

Navigate to the Create Feature page by clicking Feature Platform > Create Feature from the dropdown menu, or click the blue CREATE FEATURE button in the middle of the page.

On the Create New Feature page, create a new feature:

Click USE UI to create a feature in the UI.
Click USE CODING to develop a new feature in Java, Python, or SQL.

Using the UI to create a feature

To create a feature using the UI, follow this procedure:

Enter a Feature Name, Description, and Tags for your new feature. Marqeta recommends keeping the feature name within 80 characters to ensure that it is displayed fully in the Case Management module or in the Knowledge Graph.

Select an Operator Category and an Operator Function.

Select the Input Parameter Type for the data input fields that are used in the feature.

Test your feature (optional).

About operators

In Feature Platform, an operator is a predefined logic or transformation that can be applied to raw data fields or existing features in order to generate a new feature. For example, you can have a simple operator that adds two columns of your data together to compute the sum, an operator that returns the prefix of an email address, or an operator that determines an IP address’ country of origin. Operators are classified into several categories, as shown below:

Aggregation: Aggregation operators are applied over a time span to derive the feature value. For example, COLLECT aggregates all values within a time frame and saves them to a list.
Generic: Generic operators include arithmetic or logical operations, as well as simple text manipulation. For example, AND_DV_DEFAULT has two conditional inputs (condition_1, condition_2). If both conditions are satisfied, it returns TRUE. Otherwise, it returns FALSE.
Digital Entity: Digital entity operators apply to email addresses, IP addresses, and phone numbers.
Time: Time operators transform fields into required date and time fields such as Time to Week and Time to Date.
Text: Text operators manipulate strings of text.
Math: Math operators include commonly used math operations such as sqrt and power.
Geolocation: Geolocation operators calculate the distance between pairs of postal codes and/or addresses.
Attribute Specific: Attribute-specific operators are intended for specific attribute fields. For example, EMAIL_PREFIX extracts the string before the @ sign in an email address.
Non-Aggregation: Non-aggregation operators are operators that do not depend on an aggregation of values over a time span.

Use the Explore All Operators option available at the bottom of the operator selection list to view and search all available operators.

Creating a non-aggregation feature

When you create a custom non-aggregation feature, you must select a non-aggregation operator function. The following example demonstrates how to create a non-aggregation custom feature. In this example, your dataset has a data field named email that corresponds to the email address of a user. You want to create a non-aggregation feature that corresponds to the prefix of the email address (which is the substring preceding the @ symbol) and call this new feature EMAIL_PREFIX_EXAMPLE. To create the EMAIL_PREFIX_EXAMPLE feature, follow this procedure:

Name your feature and add a description. You can also add any appropriate tags associated with the new feature. Adding a description and tags facilitates searching for that newly created feature in your Feature List.

Select a non-aggregation function to apply to a specific field. You can narrow down the list of operator functions by type, or you can click EXPLORE ALL OPERATOR FUNCTIONS to search and enable the specific operator you want to use. For this procedure, select the EMAIL_PREFIX operator.

Feature Platform displays the operator’s Return Type to inform you of the data type of your functions’ output. The return type for the EMAIL_PREFIX operator is String.

After you select your operator, Feature Platform displays the Enter Parameters window. Choose one of the following options:

Select the email input parameter from the dropdown list of features. Input parameters selected from the list are either raw data fields or an input feature. When the input changes, the output feature value changes accordingly.
Select Input Parameter Type as a Constant and manually enter an email address such as test@example.com. Constant input parameters are fixed values. In this example, the returned output will always be the constant value of test for all rows in your dataset.

Click CREATE FEATURE to add the new feature to your Feature List for use in modeling or rules, or click TEST to try out your new feature with your data before you finish creating it.
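The difference between a field input and a constant input can be sketched in Python. This is an illustrative sketch only, not the platform's EMAIL_PREFIX implementation: the email_prefix function, the sample rows, and the behavior for values without an @ sign are assumptions made for the example.

```python
from typing import Optional


def email_prefix(email: Optional[str]) -> Optional[str]:
    """Return the substring before the first @ sign.

    Returning None for missing or malformed values is an assumption of
    this sketch, not documented platform behavior.
    """
    if not email or "@" not in email:
        return None
    return email.split("@", 1)[0]


# Hypothetical dataset rows with an email field.
rows = [{"email": "alice@example.com"}, {"email": "bob@example.org"}]

# Field input: the output follows each row's email value.
field_outputs = [email_prefix(row["email"]) for row in rows]

# Constant input: every row receives the same output.
constant_outputs = [email_prefix("test@example.com") for _ in rows]

print(field_outputs)     # ['alice', 'bob']
print(constant_outputs)  # ['test', 'test']
```

With a field input the feature value tracks the data, whereas a constant input yields the same value on every row, which is mainly useful for quick checks of an operator's behavior.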

Creating an aggregation feature

Aggregation features are customizable behavioral features that aggregate events related to a specific entity over a specific time period, allowing you to do time-span analysis to answer complex questions such as the frequency of specific activity types. An aggregation feature requires an Aggregator by which the input data will be grouped and analyzed. An aggregator is the Real-Time Decisioning equivalent of the SQL GROUP BY clause, but it is more powerful, particularly across time spans. The following example demonstrates how to create a custom aggregation feature. In this example, your dataset has an ip field indicating the IP address of each customer request, and you want to know how many unique billing addresses were associated with the IP address 64.237.37.122 across all customer requests in the past. Because this number could be very large, you should limit your search to the last 10 days, with a sliding window approach over time. This example uses the distinct count aggregator operation. Distinct count is a built-in operator under the Aggregation category. To create the new aggregation feature, follow this procedure:

On the Create New Feature page, name your feature and add a description. Apply any necessary tags. You can apply more tags later, if needed.

Select the aggregation category under Select Operator to limit the number of available functions.

Select the aggregation operator distinct count. The distinct count operator returns the Integer data type. For every IP address within the selected time range and dataset, the total number of distinct billing addresses will be represented by a single number.

In this example, selecting the function is not enough to build the feature. You also need to know the data fields and time span to which you will apply the distinct count operator in your new feature. Because you are calculating a distinct count of billing addresses per IP address, specify the aggregation target as address and the Aggregated By value as ip.

Select the time span for data aggregation to finish creating the aggregation feature. Feature Platform allows you to specify time spans in days, hours, minutes, and even seconds. The selected time period can range from a minimum of one second to a maximum of 180 days. In real-time integration, if you want to skip very recent data when calculating a feature, select Exclude Now. In this example, select the last ten days. There are two ways to set this value:

Use the graphic sliding ruler to mark the start and end time of the desired aggregation time window.
Manually specify the desired time period using the Start Day and End Day dropdown lists.

Aggregation features are usually defined to support data aggregation for different event types. For example, you may want to count the total number of transactions when the transaction is a Send event instead of all events. You have the option to specify the applicable event types in the Select Event Types section. You can choose to include or exclude specific types. Exercise caution when using the Any Event Type option, because it will include all future event types that are added to the system automatically. In this example, you only want to calculate this feature for events of the application event type.

In the Select Event Types field, select Include application. Marqeta recommends that you always select the applicable event types associated with your specific use cases when defining aggregation features so that the features are stronger in predicting fraud risk.

You can further refine the newly created aggregator by adding more conditions. In this example, you do not want to calculate the distinct count of billing addresses for all IP addresses, only 64.237.37.122. Use the Add Conditions section to introduce filters based on specific attributes.

In the Add Conditions section, click the Click to Add Criteria option.
Set the attribute as ip, set the operator to STR_EQ (string equals), and choose the option for Constant Value, where you can enter the IP address 64.237.37.122. In this example, one condition is sufficient. For more complex cases, you can click +ADD CONDITION again to add more conditions, and click +ADD BLOCK to chain the conditions together by AND and OR blocks. You can use multiple AND/OR logic blocks to support more complex conditions.

Click CREATE FEATURE. As with non-aggregation features, you have the option to click TEST on a sample of your data.
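The logic of the feature configured in this procedure can be approximated in plain Python. The sketch below is not how Feature Platform computes aggregations internally. The event dictionaries, their field names (time, event_type, ip, address), and the distinct_billing_addresses function are hypothetical and exist only to show how the time window, event type, and condition combine.

```python
from datetime import datetime, timedelta


def distinct_billing_addresses(events, now, window=timedelta(days=10),
                               event_type="application",
                               ip="64.237.37.122"):
    """Count distinct addresses for one IP inside a sliding time window."""
    start = now - window
    addresses = {
        event["address"]
        for event in events
        if start <= event["time"] <= now      # sliding 10-day window
        and event["event_type"] == event_type  # Include application
        and event["ip"] == ip                  # STR_EQ condition on ip
    }
    return len(addresses)


def make_event(days_ago, address, now, event_type="application",
               ip="64.237.37.122"):
    """Build a hypothetical event record for the example."""
    return {"time": now - timedelta(days=days_ago), "address": address,
            "event_type": event_type, "ip": ip}


now = datetime(2024, 1, 31, 12, 0)
events = [
    make_event(1, "1 Main St", now),
    make_event(2, "1 Main St", now),                    # duplicate address
    make_event(3, "9 Oak Ave", now),
    make_event(11, "5 Pine Rd", now),                   # outside the window
    make_event(1, "7 Elm St", now, event_type="login"),  # other event type
    make_event(1, "3 Birch Ln", now, ip="10.0.0.1"),     # other IP address
]

result = distinct_billing_addresses(events, now)
print(result)  # 2
```

Each filter in the set comprehension corresponds to one configuration step: the time span, the Select Event Types setting, and the condition added in the Add Conditions section.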

Creating a feature from a list or a data source

You can use data source features to look up a value from a data source by key. For example, if you use the Blocklist or Allowlist system, you may want to create a Boolean feature that checks if an entity appears on either of these lists. You can use the Create Feature from Blocklist/Allowlist option to define such a feature. To create a feature from a custom data source, you have more options. You can choose one or multiple output fields to create one or more features. If all you want to do is verify whether a value exists in a data source, check the Exists/Absent option.
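Conceptually, these features behave like keyed lookups. The following Python sketch illustrates the idea under stated assumptions: the in-memory blocklist set, the merchant_source dictionary, and both function names are hypothetical stand-ins, not platform APIs.

```python
def exists_feature(value, entries):
    """Boolean feature: True when the value appears in the list."""
    return value in entries


def lookup_features(key, data_source, output_fields):
    """Return one feature value per requested output field.

    Returning None for a missing key or field is an assumption of this
    sketch.
    """
    record = data_source.get(key)
    if record is None:
        return {field: None for field in output_fields}
    return {field: record.get(field) for field in output_fields}


# Hypothetical blocklist and custom data source keyed by merchant ID.
blocklist = {"64.237.37.122", "203.0.113.7"}
merchant_source = {
    "m-001": {"risk_tier": "high", "country": "US"},
    "m-002": {"risk_tier": "low", "country": "CA"},
}

print(exists_feature("64.237.37.122", blocklist))               # True
print(lookup_features("m-001", merchant_source, ["risk_tier"]))  # {'risk_tier': 'high'}
```

The first function mirrors the Exists/Absent style of check, and the second mirrors choosing one or more output fields from a custom data source.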

Hotspot aggregation features

Some aggregation features may calculate a large number of events in a short time. For example, calculating the number of unique customer IP addresses from a certain region per merchant for a 24-hour window in real time might require evaluating millions of records. To avoid degrading system performance, Real-Time Decisioning limits the total number of aggregated events for some aggregation operators to 1000. Below is a partial list of aggregation operators that observe the 1000 event limit:

count
average
max
min
first
last
distinct count
stdev
collect
outlier
zscore
median
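The effect of an event cap on an aggregation can be illustrated with a bounded buffer. The text does not say which events are kept once the limit is reached, so the sketch below assumes, purely for illustration, that only the most recent 1000 events are retained. The CappedAggregator class is hypothetical.

```python
from collections import deque

EVENT_LIMIT = 1000  # limit stated in the text above


class CappedAggregator:
    """Keep at most EVENT_LIMIT events; the oldest are dropped first.

    Dropping the oldest events is an assumption of this sketch, not
    documented platform behavior.
    """

    def __init__(self, limit=EVENT_LIMIT):
        self._values = deque(maxlen=limit)

    def add(self, value):
        self._values.append(value)

    def count(self):
        return len(self._values)

    def distinct_count(self):
        return len(set(self._values))


aggregator = CappedAggregator()
for i in range(1500):
    aggregator.add(f"10.0.{i // 256}.{i % 256}")  # 1500 unique IP strings

print(aggregator.count())           # 1000
print(aggregator.distinct_count())  # 1000
```

Under this assumption, a feature that would otherwise report 1500 reports 1000, which is worth keeping in mind when setting rule thresholds on high-volume entities.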

Creating new features using code

If you need to create custom features that cannot easily be built through the UI, you can create custom features directly in code. The Real-Time Decisioning Feature Platform supports Java, Python, and SQL. When you select Use Coding under Create New Feature, the Real-Time Decisioning Feature Platform displays a console in which you can write code to generate a feature. First, select your desired language from the dropdown list: Java, Python, or SQL. Then write the code snippets in the coding console. Your code snippets must end with a return statement to return the value of your new feature. In your code snippets, you may need to reference data fields from the input datasets, or refer to other existing features. Format such references by placing the $ symbol in front of your data field name or feature name in your code. When you enter the $ symbol, a dropdown list appears to help you select the relevant features or data fields.

Note
Libraries for Python and Java (for example, Math for Java or NumPy for Python) are not currently accessible through the coding console. However, you will have access to a set of Real-Time Decisioning utility functions that may be helpful for generating features.

The following example demonstrates how to extract the prefix of an email address using code, similar to the previous example using the UI.

On the Create New Feature page, select Use Coding to display the coding console.

Write your code or script in Python or Java to extract the prefix of an email address, using $email to reference the email field in your dataset. The coding console includes some IDE-like features, such as autocomplete and syntax highlighting. For example, default String functionality is available in Java. Any comments in the code are preserved when you return to view this feature.

Complete the Additional Details section, as you would when creating a feature using the UI. You must name your feature under Feature Name and specify a return type under Return Type. The Feature Platform currently supports the following return types:

Integer
Float
String
Boolean
Set
Dict
List

The Description and Tags fields are optional, but Marqeta suggests adding them to facilitate searching in the future.

You can optionally test this feature before finalizing its creation. If you choose to test your feature, you are directed to the Test Feature page.

Click CREATE FEATURE to finish creating the new feature and add it to your Feature List, where you can access it later on for modeling or creating rules.

Regardless of whether you choose to test your new feature, Feature Platform displays an error message if your code contains any errors.
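To make the relationship between a code snippet and its declared return type concrete, the sketch below wraps the email prefix logic in an ordinary Python function and checks its result against the return types listed above. In the coding console, the function argument would instead be written as the $email reference and the snippet would end with a bare return statement. The mapping from platform return types to Python types and the matches_return_type helper are assumptions made for illustration.

```python
# Assumed mapping from Feature Platform return types to Python types.
RETURN_TYPES = {
    "Integer": int,
    "Float": float,
    "String": str,
    "Boolean": bool,
    "Set": set,
    "Dict": dict,
    "List": list,
}


def matches_return_type(value, declared):
    """Check whether a feature value fits the declared return type."""
    expected = RETURN_TYPES[declared]
    # bool is a subclass of int in Python, so reject True/False for Integer.
    if expected is int and isinstance(value, bool):
        return False
    return isinstance(value, expected)


def email_prefix_feature(email):
    """Stand-in for a console snippet; email plays the role of $email."""
    return email.split("@", 1)[0]


value = email_prefix_feature("test@example.com")
print(value)                                  # test
print(matches_return_type(value, "String"))   # True
print(matches_return_type(value, "Integer"))  # False
```

Checking the declared type against a few sample values before you click CREATE FEATURE is a cheap way to catch mismatches that would otherwise only surface during testing.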

Creating new features using generative AI

When you create a feature in a Java or Python script, you can use Real-Time Decisioning’s generative AI-driven tools to create features more easily and with greater confidence that the code is correct.

Generating feature scripts from instructions

When you create a feature using code, you can write instructions and generate feature scripts from those instructions. To reference existing features in the script, insert a $ symbol in front of the feature name. For example, the instruction “Create a feature that returns true if the user address $address is a PO Box” generates a feature script that checks for the existence of the string PO Box in the feature named address. You can further enhance your code by adding checks for other potential strings, such as POBox.
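A script of the kind described above might look like the following Python sketch. It is not actual output of the generative AI tool: the is_po_box function and the regular expression are assumptions, and the address argument stands in for the $address reference used in the console.

```python
import re

# Matches "PO Box", "POBox", "P.O. Box", and similar, ignoring case.
_PO_BOX = re.compile(r"\bP\.?\s*O\.?\s*Box\b", re.IGNORECASE)


def is_po_box(address):
    """Return True when the address text mentions a PO Box."""
    return bool(_PO_BOX.search(address))


print(is_po_box("PO Box 123, Springfield"))  # True
print(is_po_box("POBox 9"))                  # True
print(is_po_box("123 Main St"))              # False
```

Using one pattern instead of several literal string checks covers the common spelling variants while the leading word boundary avoids matching words that merely contain the letters "po".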

Improving feature scripts with generative AI

After you enter the feature script, use the Suggest Enhancement tool to check for potential feature enhancements. This tool leverages both generative AI and a set of predetermined rules to check if the feature script is syntactically correct and if it handles potential runtime null pointer errors adequately. The Suggest Enhancement tool displays suggested changes to the right of the original code, with the differences highlighted. Click the Accept button to apply the suggested changes.
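The kind of null-handling change such a tool might suggest can be sketched as a before-and-after pair in Python. Both functions are hypothetical examples written for this illustration, not output of the Suggest Enhancement tool, and returning False for a missing value is an assumption.

```python
def po_box_original(address):
    """Original script: fails when the address is missing."""
    return "PO Box" in address


def po_box_enhanced(address):
    """Enhanced script: guards against a missing address first."""
    if address is None:
        return False  # assumed default for missing input
    return "PO Box" in address


print(po_box_enhanced("PO Box 12"))  # True
print(po_box_enhanced(None))         # False

try:
    po_box_original(None)
except TypeError as error:
    print(f"original script fails: {error}")
```

The guard adds one branch and removes a runtime failure mode, which is the trade-off to weigh when you decide whether to accept a suggested change.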

Managing, testing, and backfilling features

Managing features

On the Feature Platform homepage, click the FEATURES > AVAILABLE FEATURES > VIEW link to view all features. To customize the information shown on this page, select the Edit Columns button in the upper-right corner of the table to choose which columns to display. Click ALL to display all fields. Click DEFAULT to restore the default selection. For any feature in the table, you can click the vertical ellipsis menu in the rightmost column and choose any of the following actions: Test, Edit, Copy, See Dependency, or Delete.

Edit

For any features in draft status, you can click the vertical ellipsis menu in the rightmost column and choose EDIT, then make modifications to the selected feature on the Edit Current Feature page. If a feature was created in code, you can edit the feature’s source code.

Copy

You can copy a feature to make modifications to it without deleting the original version. From the vertical ellipsis menu, click Copy to copy the selected feature. Feature Platform displays an alert that the newly copied feature must be renamed. All features must have a unique name. Click EDIT to proceed to the Copy Feature page.

See dependency

From the vertical ellipsis menu, click See Dependency to view the dependencies of the selected feature. A hierarchical graph will appear on the right side of the screen containing the event attributes, templates, or features upon which the selected feature is dependent, as well as any features derived from the selected feature. You can also click the Table option in the Feature Dependency window to view the feature dependencies in a table format.

Delete

From the vertical ellipsis menu, click Delete to delete the selected feature. When you delete a feature, Feature Platform prompts you to confirm your actions. Click CONFIRM to delete the feature, or click CANCEL to cancel the action. In the backend, the deletion function checks the dependencies of the selected feature. You cannot delete a feature if any other features or rules depend on it.
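The dependency check that guards deletion can be pictured as a reverse lookup over a dependency map. The sketch below is a conceptual illustration, not the backend implementation. The depends_on dictionary, the feature and rule names, and both functions are hypothetical.

```python
def dependents_of(name, depends_on):
    """List every feature or rule that depends directly on the given feature."""
    return sorted(item for item, deps in depends_on.items() if name in deps)


def can_delete(name, depends_on):
    """A feature is deletable only when nothing depends on it."""
    return not dependents_of(name, depends_on)


# Hypothetical map: each feature or rule lists what it depends on.
depends_on = {
    "EMAIL_PREFIX_EXAMPLE": ["email"],
    "PREFIX_LENGTH": ["EMAIL_PREFIX_EXAMPLE"],
    "rule_short_prefix": ["PREFIX_LENGTH"],
}

print(can_delete("EMAIL_PREFIX_EXAMPLE", depends_on))     # False
print(dependents_of("EMAIL_PREFIX_EXAMPLE", depends_on))  # ['PREFIX_LENGTH']
print(can_delete("rule_short_prefix", depends_on))        # True
```

In practice this means deleting from the top of the dependency graph downward: remove or rework the dependent rules and features first, then the feature they rely on.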

View source code

You can view the definition of a feature by clicking its name in the feature list table. To view the definition of the features and operator, click the feature and operator names in the FEATURE DETAILS window.

Testing features

You can test a feature using the Test function available through the vertical ellipsis menu in the rightmost column. Feature Platform directs you to the Test Feature page, where you can select an appropriate dataset to be used for testing.

Note
The SELECT THE DATA SOURCE field only displays validated datasets, along with any datasets that contain the event attributes used in the selected feature. You can confirm which datasets are validated by going to Data Studio and selecting Datasets. Datasets that are Validated With Warnings are still considered valid.

To test a feature, follow this procedure:

Select the feature in the Feature Platform – All Features table, click the vertical ellipsis menu in the rightmost column, then select Test. You can use previous production data or an existing dataset in your test. After you have selected a dataset, the system checks if the selected dataset includes all the necessary dependencies to calculate the feature. If it does not include all necessary dependencies, Feature Platform displays a warning message.

Select the number of records to test. Some datasets may contain multiple records that span several days. You can test your feature using one or more records. Although you can test with anywhere from the Top 500 records to the Top 10,000 records, testing with only the top 500 records is recommended for faster test execution.

Click Start Test.

When the test is done, Feature Platform displays the Test Result table. The feature selected for testing is shown in the first column. You can sort, search, and customize the columns displayed in the table. In Test Result, open the Statistics View tab to view the distribution report of each feature in the test result.

To export your results, choose Export In Original Format.

Publishing and unpublishing features

You are ready to publish your feature when you are confident that it is working as expected. Once a feature is published, it cannot be modified until it is moved back into draft status. Only published features are computed in production. You must unpublish a published feature before you can modify it. Unpublishing a feature does not affect the production system. Any rules that use the feature will continue to run with the previously published code until you save your modification. At that point, the modified code will be used in production. If the feature approval workflow is enabled, all publish and unpublish requests will first transition to the pending state until the reviewer has either approved or rejected the requests. When a publishing or unpublishing request is rejected, the feature reverts to its previous state. To publish a feature, follow this procedure:

Navigate to Feature Platform – All Features.

In the Status column, update the feature’s status from Draft to Published.

Click CONFIRM to proceed and publish your feature, or click CANCEL to cancel publishing.

To quickly publish multiple features, click the checkboxes corresponding to the features you want to publish in the leftmost column.

Once you have selected all features you want to publish, click the PUBLISH button.

Click CONFIRM to publish all selected features, or click CANCEL to cancel publishing.

When you publish a feature, all of its dependencies are published at the same time.
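The status transitions described in this section can be summarized as a small state machine. The Python sketch below models only what the text states: Draft and Published states, a pending state when the approval workflow is enabled, and a revert to the previous state on rejection. The FeatureStatus class and its method names are hypothetical and do not correspond to a platform API.

```python
class FeatureStatus:
    """Minimal model of the Draft, Pending, and Published lifecycle."""

    def __init__(self, approval_required=False):
        self.status = "Draft"
        self.approval_required = approval_required
        self._previous = None  # state to restore if a request is rejected
        self._target = None    # state to enter if a request is approved

    def _request(self, target):
        if self.status == "Pending":
            raise ValueError("a request is already pending review")
        if self.status == target:
            raise ValueError(f"feature is already {target}")
        if self.approval_required:
            self._previous, self._target = self.status, target
            self.status = "Pending"
        else:
            self.status = target

    def publish(self):
        self._request("Published")

    def unpublish(self):
        self._request("Draft")

    def review(self, approved):
        if self.status != "Pending":
            raise ValueError("no request is pending review")
        self.status = self._target if approved else self._previous

    def can_edit(self):
        # Only draft features can be modified.
        return self.status == "Draft"


feature = FeatureStatus(approval_required=True)
feature.publish()
print(feature.status)      # Pending
feature.review(approved=False)
print(feature.status)      # Draft
feature.publish()
feature.review(approved=True)
print(feature.status)      # Published
print(feature.can_edit())  # False
```

Modeling rejection as a return to the stored previous state covers both directions: a rejected publish request leaves the feature in Draft, and a rejected unpublish request leaves it Published.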

Exporting feature configurations

To export feature configurations, click the EXPORT button in the Feature Configuration window. Any feature that is shown in the Feature Platform – Select Features To Export table can be exported as a JSON-formatted file. You can use the exported file to import your features into a new environment. Click the checkboxes corresponding to the features that you want to export, then click EXPORT. You can export multiple feature configurations concurrently. The JSON-formatted file contains the feature configurations necessary to recreate that same feature in a new environment. The exported file contains the following keys:

EVENT_ATTRIBUTE_INFO: High-level fields such as the feature’s name and return type.
FEATURE: Feature-level metadata such as details related to the script that generated the feature and the parameter names.
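Before importing an exported file into another environment, it can be useful to confirm that both keys are present. The following Python sketch assumes the two keys sit at the top level of a single JSON object. The nested field names in the sample (name, return_type, script, parameters) are hypothetical placeholders, because the text does not specify the inner structure.

```python
import json

REQUIRED_KEYS = {"EVENT_ATTRIBUTE_INFO", "FEATURE"}


def missing_keys(exported_text):
    """Return the required top-level keys absent from an exported file."""
    config = json.loads(exported_text)
    return sorted(REQUIRED_KEYS - set(config))


# Hypothetical exported content; inner fields are illustrative only.
sample = json.dumps({
    "EVENT_ATTRIBUTE_INFO": {"name": "EMAIL_PREFIX_EXAMPLE",
                             "return_type": "String"},
    "FEATURE": {"script": "placeholder", "parameters": ["email"]},
})

print(missing_keys(sample))                          # []
print(missing_keys('{"EVENT_ATTRIBUTE_INFO": {}}'))  # ['FEATURE']
```

A check like this only validates the presence of the keys, so it complements rather than replaces testing the imported feature in the target environment.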

Backfilling features

Aggregation features depend on historical data across a span of time for aggregation. When new features are added to the production system, it may take some time for them to effectively aggregate and obtain the desired computational result. In order to reduce the required aggregation time so that features of this type can be used for detection as soon as they are put into production, Feature Platform provides a backfill function to calculate the aggregation result with historical data. To use the backfill function, first select the aggregation features for backfill. Click the Backfill button at the top of the page to configure a backfill task. You can choose to use production data to backfill the new features by specifying a time range. If there is not yet enough data available in production, you can use a dataset to supply the necessary historical data. The backfill task may take some time to finish. To check the status of the task, go to Task Center.

Backfill reminders

Feature Platform displays a backfill reminder icon when you attempt to publish an aggregation feature that has not been sufficiently backfilled to calculate the aggregation result for the specified time period. The backfill reminder icon appears for seven days, or until you have performed the required backfill, whichever is earlier.

Bulk updating aggregation features

Use the bulk update function to modify the applicable event types of multiple aggregation features concurrently. Exercise caution when using this powerful function. Access the bulk update function from the Feature Platform console page. After you select the features to update, you can either replace all existing configured event types for the selected features or append additional types to the existing configured event types. You can choose to update the selected features along with those unselected features that share the same aggregation conditions, or to update the selected features only.
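The difference between the replace and append options can be shown with a short Python sketch. The bulk_update_event_types function, the feature names, and the dictionary that maps each feature to its configured event types are hypothetical. The sketch updates only the selected features and does not model the option to include unselected features that share the same aggregation conditions.

```python
def bulk_update_event_types(features, selected, event_types, mode):
    """Replace or append event types for the selected features in place."""
    if mode not in ("replace", "append"):
        raise ValueError("mode must be 'replace' or 'append'")
    for name in selected:
        current = features[name]
        if mode == "replace":
            features[name] = list(event_types)
        else:
            # Append only the types that are not already configured.
            extra = [t for t in event_types if t not in current]
            features[name] = current + extra
    return features


# Hypothetical aggregation features and their configured event types.
features = {
    "ip_address_count_10d": ["application"],
    "txn_count_24h": ["send"],
    "login_count_1h": ["login"],
}

bulk_update_event_types(features, ["ip_address_count_10d"],
                        ["application", "login"], mode="append")
bulk_update_event_types(features, ["txn_count_24h"],
                        ["receive"], mode="replace")

print(features["ip_address_count_10d"])  # ['application', 'login']
print(features["txn_count_24h"])         # ['receive']
print(features["login_count_1h"])        # ['login']
```

Append preserves existing configuration and is the safer default, while replace discards it, which is why it deserves a careful review of the selection before you confirm.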

Important
The bulk update function does not affect the event type conditions explicitly coded in the aggregation conditions. For this reason, Marqeta recommends that you do not code the event type conditions explicitly in the aggregation condition. Use the Event Type configuration section to specify the event types that should be applied or excluded instead.

Feature reporting

You can download a feature report in CSV format from the Feature Platform Console. The report includes a list of all features, where each feature is used, and other feature-related information. The Where Used field lists all downstream dependent features and rule entities.
