2 Univariate Hawkes
A univariate Hawkes process (Hawkes (1971)) is defined to be a self-exciting temporal point process where the conditional intensity function is given by
Here
Here the parameter
Plotted below is the conditional intensity (
2.1 Simulating from a univariate Hawkes model
The function sim_hawkes()
offers two simulation methods for simulating a realisation of a univariate Hawkes process. The default, with method = 1
, uses the same method as algorithm 2 in Ogata (1981), simulating until a given time horizon (set by argument n
).
<- sim_hawkes(mu = 0.5, alpha = 1, beta = 1.5, n = 40, plot = TRUE) sim
The second option, setting method = 2
, uses an accept/reject framework and the argument n
specifies the number of points to simulate (default n = 100
).
<- sim_hawkes(mu = 0.5, alpha = 1, beta = 1.5, plot = TRUE, method = 2) sim
2.2 Fitting a univariate Hawkes model
require(stelfi)
To fit a univariate Hawkes model in stelfi
use the function fit_hawkes()
with the following required arguments
times
- a vector of numeric occurrence times, andparameters
- a vector of named starting values for (mu
), (alpha
), and (beta
).
The function get_coefs()
can then be called on the fitted model object to return the estimated parameter values.
2.2.1 A simulated example
Simulating a realisation of a univariate Hawkes process (see Section 2.1 for more details) with
<- sim_hawkes(mu = 1.3, alpha = 0.4, beta = 1.5, n = 500) times
## starting values
<- c(mu = 1.3, alpha = 0.4, beta = 1.5)
sv ## using stelfi
<- fit_hawkes(times = times, parameters = sv)
fit <- get_coefs(fit)
stelfi stelfi
Estimate Std. Error
mu 1.3997320 0.1125668
alpha 0.3737118 0.1207772
beta 1.7730216 0.7495558
As a comparison, below emhawkes
(Lee (2023)), and hawkesbow
(Cheysson (2021)) are used to fit a univariate Hawkes process to the same simulated data.
## benchmark using emhawkes
require(emhawkes)
<- new("hspec", mu = sv[1], alpha = sv[2], beta = sv[3])
h ## emhawkes requires the inter arrival times to fit the model
<- diff(times)
inter <- hfit(object = h, inter_arrival = inter)
fit_em <- summary(fit_em)$estimate
em em
Estimate Std. error t value Pr(> t)
mu1 1.3976553 0.1091437 12.805644 1.524557e-37
alpha1 0.3683585 0.1380857 2.667608 7.639335e-03
beta1 1.7380101 0.8236957 2.110015 3.485708e-02
## bench mark using hawkesbow
require(hawkesbow)
<- mle(events = times, kern = "Exponential", end = max(times))
fit_bow ## use the Hessian to obtain the standard errors from hawkesbow
<- cbind(Estimate = fit_bow$par,
bow "Std. error" = -fit_bow$model$ddloglik(times, max(times)) |> solve() |> diag() |> sqrt())
bow
Estimate Std. error
[1,] 1.3998368 0.11238176
[2,] 0.2107781 0.05814911
[3,] 1.7724499 0.74877227
The table below gives the estimated parameter values from each of stelfi
, emhawkes
, and hawkesbow
along with the standard errors in brackets. Note that hawkesbow
estimates
TRUTH | 1.300 | 0.400 | 1.500 | 0.267 |
stelfi | 1.4 ( 0.113 ) | 0.374 ( 0.121 ) | 1.773 ( 0.75 ) | - |
emhawkes | 1.398 ( 0.109 ) | 0.368 ( 0.138 ) | 1.738 ( 0.824 ) | - |
hawkesbow | 1.4 ( 0.112 ) | - | 1.772 ( 0.749 ) | 0.211 ( 0.058 ) |
2.2.2 An applied example
A NIWA scientist found a working USB in the scat of a leopard seal, they then tweeted about it in the hopes of finding its owner1. The dates and times of these tweets and retweets are available in stelfi
as retweets_niwa
.
data(retweets_niwa)
The dates/times need to be numeric and sorted in ascending order (starting a time
## numeric time stamps
<- unique(sort(as.numeric(difftime(retweets_niwa ,min(retweets_niwa),units = "mins")))) times
The histogram below shows the observed counts (4772) of unique retweet times from the original tweet (
To fit the model chose some starting values for the parameters and supply these along with the numeric times to the function fit_hawkes()
.
<- c(mu = 9, alpha = 3, beta = 10)
sv <- fit_hawkes(times = times, parameters = sv) fit
## print out estimated parameters
<- get_coefs(fit)
pars pars
Estimate Std. Error
mu 0.06328099 0.017783908
alpha 0.07596531 0.007777899
beta 0.07911346 0.008109789
From the estimated coefficients above
- the expected (estimated) background rate of sightings (i.e., independent sightings) is
0.063 3143.02 198.89, which indicates that ~199 retweets (from the 4772) were principal retweets and that the remaining were due to self-excitement; - the expected number of retweets “triggered” by any one retweet is estimated as
= 0.96 (note the maximum this can possibly be is 1); - the expected number of descendants per retweet is estimated as
= 25.13; - the rate of decay for the self-excitement is estimated as
= 12.64, indicating that after ~13 minutes a retweet is likely unrelated2 to the previous retweet.
The show_hawkes()
function can be called on the fitted model object to plot the estimated conditional intensity and the data, top and bottom panels below respectively.
show_hawkes(fit)
2.3 Goodness-of-fit for a univariate Hawkes process
The compensator,
The random change theorem (Daley, Vere-Jones, et al. (2003)) states that if a set of events
The compensator differences can be extracted from a fitted model using the compensator_differences()
function, and a KS test can be carried out manually:
<- sim_hawkes(mu = 1.3, alpha = 0.4, beta = 1.5, n = 500)
times <- c(mu = 2, alpha = 1, beta = 5)
sv <- fit_hawkes(times = times, parameters = sv)
fit <- compensator_differences(fit)
compensator ::ks.test(compensator, "pexp") stats
Asymptotic one-sample Kolmogorov-Smirnov test
data: compensator
D = 0.013855, p-value = 0.9958
alternative hypothesis: two-sided
This gives no evidence against the compensator values coming from a
Another goodness-of-fit test is the Box-Ljung (or Ljung–Box) test, which tests for autocorrelation between the consecutive compensator values (i.e., independence/stationarity).
::Box.test(compensator, type = "Ljung") stats
Box-Ljung test
data: compensator
X-squared = 0.66599, df = 1, p-value = 0.4145
This gives no evidence against the consecutive compensator values being independently distributed.
Alternatively, both tests and some diagnostic plots are returned by calling the show_hawkes_GOF()
function. The four panels an be interpreted as follows
- top left, plots the compensator values against the observed times, which under a well fitting model should align;
- top right, a transformed QQ plot, the observed quantities should align with the theoretical quantiles under a well fitting model;
- bottom left, the compensator differences, which under the model are assumed to be
distributed; - bottom right, consecutive compensator differences, which should show no obvious pattern (no autocorrelation evident) under a well fitting model.
show_hawkes_GOF(fit)
Asymptotic one-sample Kolmogorov-Smirnov test
data: interarrivals
D = 0.013855, p-value = 0.9958
alternative hypothesis: two-sided
Box-Ljung test
data: interarrivals
X-squared = 0.66599, df = 1, p-value = 0.4145
2.4 Fitting an inhomogenous Hawkes model
A univariate inhomogenous Hawkes process has
Below we simulate data from a Hawkes process with hawkesbow
(Cheysson (2021)).
<- function(t) {
mut 1 + 0.5*sin((2*pi*t)/365.25)
}
set.seed(1)
<- hawkesbow::hawkes(1000, fun = mut,
times M = 1.5, repr = 0.5, family = "exp", rate = 2)$p
To fit this model in stelfi
the function background
) and its integral background_integral
) are supplied by the user. To ensure
<- function(params,t){
background = exp(params[[1]])
A = stats::plogis(params[[2]]) * A
B return(A + B*sin((2*pi*t)/365.25))
}
<- function(params,x){
background_integral = exp(params[[1]])
A = stats::plogis(params[[2]]) * A
B return((A*x)-B*cos((2*pi*x)/365.25))
}
These functions are then passed to the function fit_hawkes_cbf()
alongside the observed times (times
) and a list of starting values for both background_parameters
and parameters
respectively.
= list(alpha = 0.5, beta = 1.5)
sv = list(1,1)
background_sv <- fit_hawkes_cbf(times = times, parameters = sv,
fit background = background, background_integral = background_integral, background_parameters = background_sv)
The estimated values of the the transformed fit$background_parameters
, the code below transforms them back to the original scale.
exp(fit$background_parameters[1])
[1] 1.135523
plogis(fit$background_parameters[2]) * exp(fit$background_parameters[1])
[1] 0.7755937
The estimated values of the self-excitement parameters get_coefs()
function as usual. The table below compares the estimated values to the true ones
TRUTH | 1.000 | 0.500 | 1.000 | 2.000 |
stelfi | 1.136 | 0.776 | 0.985 | 2.198 |
The function show_hawkes()
can be used to show the fitted model.
show_hawkes(fit)
The function show_hawkes_GOF()
will show/print the model diagnostics discussed in Section 2.3.
show_hawkes_GOF(fit)
Asymptotic one-sample Kolmogorov-Smirnov test
data: interarrivals
D = 0.018277, p-value = 0.4985
alternative hypothesis: two-sided
Box-Ljung test
data: interarrivals
X-squared = 28.608, df = 1, p-value = 8.86e-08