Menu

Common function description

Common function description

When using custom queries, the following types of functions are commonly used:

  • Date and Time Functions
  • String Functions
  • Math Functions

Date and Time Functions

The time-related fields in custom queries are divided into the following three types:

1. The time field in the events table

The time field is of the millisecond-level Timestamp type, and all date and time functions can be used directly.

2. The date field in the events table

The date field is of the day-level Timestamp type. If you do not need the information about the hours, minutes, and seconds, using this field will be more efficient. The date field is also an indexed field, so you should use this field as much as possible to filter the date range. Please refer to the "Date Filter" section for details.

Note: Before version 1.10, the date field did not support custom functions and could be replaced by the time field.

3. Other custom Date/Datetime type attributes

This type of attribute is displayed as a millisecond-level Unix timestamp in custom queries. When using date and time functions, you need to convert it to a Timestamp type using the EPOCH_TO_TIMESTAMP function. Please refer to the "Data Type" section for details.

adddate(timestamp startdate, int days), adddate(timestamp startdate, bigint days)

Purpose: Adds a given number of days to a TIMESTAMP value.

Parameters:

  • startdate: The starting timestamp of type Timestamp.
  • days: The number of days to add. Positive values represent the number of days after the start date, and negative values represent the number of days before the start date.

Return Value: The timestamp after adding the number of days, of the type Timestamp.

datediff(timestamp enddate, timestamp startdate)

Purpose: Returns the number of days between two timestamps, for example:

Parameters:

  • enddate: The end time
  • startdate:The start time

Return Value: The number of days between the end time and the start time, in integer. If the date of the first parameter time is later than that of the second parameter time, a positive value is returned; conversely, if the date of the first parameter time is earlier than that of the second parameter time, a negative value is returned.

extract(unit FROM timestamp), extract(timestamp, string unit)

Purpose: To extract a numerical time domain from a TIMESTAMP value, such as year, month, day, hour, minute, and second/microsecond.

Parameters

  • The time unit unit string can take values such as: year, month, day, hour, minute, second, millisecond.

Return Value: The integer value of the time domain.

For example: Querying the number of payment orders so far by year and month.

SELECT extract(Year from time) AS Year, extract(Month from time) AS Month, COUNT(*) FROM events 
WHERE event = 'payOrder' 
GROUP BY Year, Month 
ORDER BY Year, Month

trunc(timestamp, string unit)

Purpose: To extract a time domain from the given timestamp time stamp.

Parameters

  • unit: Time unit
    • SYYYY, YYYY, YEAR, SYEAR, YYY, YY, Y:YEAR
    • Q:QUARTERLY
    • MONTH, MON, MM, RM: MONTH
    • WW, W: The date of the corresponding week's first day.
    • DDD, DD, J: DATE
    • DAY, DY, D: The date of the corresponding week's first day.
    • HH, HH12, HH24: HOUR
    • MI: MINUTE

Return Value: The date after the time domain has been extracted.

For example: The number of events that occurred every day in the last 100 days and the number of days elapsed between the event occurrence time and the current date.

SELECT datediff(now(), trunc(time, 'DD')), COUNT(*) FROM events 
WHERE date >= CURRENT_DATE() - INTERVAL '100' day 
GROUP BY 1

String Functions

concat(string a, string b…)

Purpose: Concatenates all string-type parameters into one string type.

Parameter:

  • string (unlimited number): The string to be joined

Return Value: The concatenated string

Example: Querying address of users who were born after 2000, where the address is a concatenation of province and area.

SELECT concat($province, $city) As Address 
FROM users 
WHERE yearofbirth > 2000

regexp_like(string source, string pattern[, string options])

Purpose: Determines whether the string source contains a content that matches the regular expression pattern

Parameters:

  • source: The string to check
  • pattern: The regular expression
  • option(Optional): Options
    • c: Case-sensitive matching
    • i: Case-insensitive matching
    • m: Multiple matches. The ^ and $ operators will match the start and end of each line instead of the start and end of the string as a whole.
    • n:Newline matching: the dot (.) operator will match a newline. Repetition operators like . can match multiple lines in the source string. (Newlines can be skipped using. skip a few lines)

Return Value: Boolean type indicating whether it matches or not

Example: The number of users with QQ mailbox as the email

SELECT COUNT(*) FROM users 
WHERE regexp_like(email, '@qq.com$')

parse_url(string urlString, string partToExtract [, string keyToExtract])

Purpose: Extract specified part of the URL to return the truncated value

Parameters:

  • urlString: URL
  • partToExtract: The part of the URL to truncate. Valid values are 'PROTOCOL', 'HOST', 'PATH', 'REF', 'AUTHORITY', 'FILE', ‘USERINFO', ‘QUERY'
    • PROTOCOL: The protocol, such as HTTP, HTTPS, FTP, etc.
    • HOST: The host name
    • PATH: The path
    • REF: The anchor, which is the string after the # in the URL
    • AUTHORITY: Authorization
    • FILE: File Name
    • USERINFO: User Information
    • QUERY: Query Parameters, i.e. the string after the URL
  • keyToExtract (optional): When partToExtract is 'QUERY', specify the key of query key-value pairs to get the specified parameter value

Return Value: The extracted value of the specified part in the URL

For example: The access distribution of various paths in the page view event of the day

SELECT parse_url(url, 'PATH'), COUNT(*) FROM events 
WHERE date = CURRENT_DATE() AND event = '$pageview' 
GROUP BY 1

Mathematical Functions

Mathematical functions are used for some numerical operations. In particular, when doing exponentiation, please use the pow() function instead of the exponentiation operator '**'.

pow(double a, double p), power(double a, double p), dpow(double a, double p), fpow(double a, double p)

Usage: Taking exponentiation, for example:

Parameters:

  • a: the base number
  • b: the exponent

Return Value: a to the power of b

For example: Query the number of users whose total principal and interest after the maturity of financial products exceeds 100,000.

SELECT count(distinct(user_id)) FROM events 
WHERE event = 'buyProduct' AND (capital + capital * pow(rateofinterest,duration)) > 100000

round(double a), round(double a, int d), round(decimal a, int_type d), dround(double a), dround(double a, int d)

Usage: Return the rounded value, for example:

Parameters:

  • a: the value to be rounded
  • d (optional): the number of decimal places to keep. If there is no such parameter, keep up to the integer part.

Return Value: The rounded value

For example: Query the number of users whose yield rate of financial products exceeds 0.45 percentage points.

SELECT count(distinct(user_id)) FROM events 
WHERE event = 'buyProduct' AND round((income/capital),4) * 100 > 0.45

truncate(double_or_decimal a[, digits_to_leave]), dtrunc(double_or_decimal a[, digits_to_leave])

Usage: Removing the decimal part of a number, for example:

Parameters:

  • a: The value to be truncated
  • digits_to_leave (optional): The number of decimal places to keep, if this parameter is not present, it will be rounded to the integer part

Return value: The truncated value


Impala function

Please refer to: Impala Function Reference Document


Custom Impala Functions of Sensors Analytics

percentile_count function

SELECT day,percentile_count(perf_metric_ic_m_copy, 0.3) FROM events 
group by 1



Previous
Data Tables and Types
Next
Metric Analysis
Last modified: 2024-12-27