集合的类 SQL 查询

Groovy 的 groovy-ginq 模块为集合提供了更高层次的抽象。它可以以类似 SQL 的方式对内存中对象的集合执行查询。此外,还可以支持对 XML、JSON、YAML 等的查询,因为它们可以解析为集合。由于 GORM 和 jOOQ 足够强大以支持查询 DB,因此我们将首先介绍集合。

1. GINQ 又名 Groovy 集成查询

GINQ 是一个使用类 SQL 语法的查询 DSL,它包含以下结构

GQ, i.e. abbreviation for GINQ
|__ from
|   |__ <data_source_alias> in <data_source>
|__ [join/innerjoin/leftjoin/rightjoin/fulljoin/crossjoin]*
|   |__ <data_source_alias> in <data_source>
|   |__ on <condition> ((&& | ||) <condition>)* (NOTE: `crossjoin` does not need `on` clause)
|__ [where]
|   |__ <condition> ((&& | ||) <condition>)*
|__ [groupby]
|   |__ <expression> [as <alias>] (, <expression> [as <alias>])*
|   |__ [having]
|       |__ <condition> ((&& | ||) <condition>)*
|__ [orderby]
|   |__ <expression> [in (asc|desc)] (, <expression> [in (asc|desc)])*
|__ [limit]
|   |__ [<offset>,] <size>
|__ select
    |__ <expression> [as <alias>] (, <expression> [as <alias>])*
[] 表示相关子句是可选的,* 表示零次或多次,+ 表示一次或多次。此外,GINQ 的子句是顺序敏感的,因此应将子句的顺序保留为上述结构

正如我们所见,最简单的 GINQ 包含一个 from 子句和一个 select 子句,它看起来像

from n in [0, 1, 2]
select n
GINQ 中只需要 ONLY ONE from 子句。此外,GINQ 通过 from 和相关的联接支持多个数据源。

作为一个 DSL,GINQ 应该被包装在以下代码块中以执行

GQ { /* GINQ CODE */ }

例如,

def numbers = [0, 1, 2]
assert [0, 1, 2] == GQ {
    from n in numbers
    select n
}.toList()
import java.util.stream.Collectors

def numbers = [0, 1, 2]
assert '0#1#2' == GQ {
    from n in numbers
    select n
}.stream()
    .map(e -> String.valueOf(e))
    .collect(Collectors.joining('#'))

强烈建议使用 def 来定义 GINQ 执行结果的变量,该结果是一个延迟的 Queryable 实例。

def result = GQ {
    /* GINQ CODE */
}
def stream = result.stream() // get the stream from GINQ result
def list = result.toList() // get the list from GINQ result
目前,当启用 STC 时,GINQ 无法正常工作。

此外,GINQ 也可以写在用 @GQ 标记的方法中

@GQ
def someGinqMethod() {
    /* GINQ CODE */
}

例如,

  • 使用 @GQ 注解将 ginq 方法标记为 GINQ 方法

@groovy.ginq.transform.GQ
def ginq(list, b, e) {
    from n in list
    where b < n && n < e
    select n
}

assert [3, 4] == ginq([1, 2, 3, 4, 5, 6], 2, 5).toList()
  • 将结果类型指定为 List

import groovy.ginq.transform.GQ

@GQ(List)
def ginq(b, e) {
    from n in [1, 2, 3, 4, 5, 6]
    where b < n && n < e
    select n
}

assert [3, 4] == ginq(2, 5)
GINQ 支持多种结果类型,例如 ListSetCollectionIterableIteratorjava.util.stream.Stream 和数组类型。
  • 启用并行查询

import groovy.ginq.transform.GQ

@GQ(parallel=true)
def ginq(x) {
    from n in [1, 2, 3]
    where n < x
    select n
}

assert [1] == ginq(2).toList()

1.1. GINQ 语法

1.1.1. 数据源

GINQ 的数据源可以通过 from 子句指定,它等效于 SQL 的 FROM。目前,GINQ 支持 IterableStream、数组和 GINQ 结果集作为其数据源

Iterable 数据源
from n in [1, 2, 3] select n
Stream 数据源
from n in [1, 2, 3].stream() select n
数组数据源
from n in new int[] {1, 2, 3} select n
GINQ 结果集数据源
def vt = GQ {from m in [1, 2, 3] select m}
assert [1, 2, 3] == GQ {
    from n in vt select n
}.toList()

1.1.2. 投影

可以使用 as 子句重命名列名

def result = GQ {
    from n in [1, 2, 3]
    select Math.pow(n, 2) as powerOfN
}
assert [[1, 1], [4, 4], [9, 9]] == result.stream().map(r -> [r[0], r.powerOfN]).toList()
可以使用新名称引用重命名的列,例如 r.powerOfN。也可以使用索引引用它,例如 r[0]
assert [[1, 1], [2, 4], [3, 9]] == GQ {
    from v in (
        from n in [1, 2, 3]
        select n, Math.pow(n, 2) as powerOfN
    )
    select v.n, v.powerOfN
}.toList()
当且仅当 n >= 2 时,select P1, P2, …​, Pnselect new NamedRecord(P1, P2, …​, Pn) 的简化语法。此外,如果使用 as 子句,将创建 NamedRecord 实例。存储在 NamedRecord 中的值可以通过其名称引用。

将新对象构造为列值

@groovy.transform.EqualsAndHashCode
class Person {
    String name
    Person(String name) {
        this.name = name
    }
}
def persons = [new Person('Daniel'), new Person('Paul'), new Person('Eric')]
assert persons == GQ {
    from n in ['Daniel', 'Paul', 'Eric']
    select new Person(n)
}.toList()
Distinct

distinct 等效于 SQL 的 DISTINCT

def result = GQ {
    from n in [1, 2, 2, 3, 3, 3]
    select distinct(n)
}
assert [1, 2, 3] == result.toList()
def result = GQ {
    from n in [1, 2, 2, 3, 3, 3]
    select distinct(n, n + 1)
}
assert [[1, 2], [2, 3], [3, 4]] == result.toList()

1.1.3. 过滤

where 等效于 SQL 的 WHERE

from n in [0, 1, 2, 3, 4, 5]
where n > 0 && n <= 3
select n * 2
In
from n in [0, 1, 2]
where n in [1, 2]
select n
from n in [0, 1, 2]
where n in (
    from m in [1, 2]
    select m
)
select n
import static groovy.lang.Tuple.tuple
assert [0, 1] == GQ {
    from n in [0, 1, 2]
    where tuple(n, n + 1) in (
        from m in [1, 2]
        select m - 1, m
    )
    select n
}.toList()
Not In
from n in [0, 1, 2]
where n !in [1, 2]
select n
from n in [0, 1, 2]
where n !in (
    from m in [1, 2]
    select m
)
select n
import static groovy.lang.Tuple.tuple
assert [2] == GQ {
    from n in [0, 1, 2]
    where tuple(n, n + 1) !in (
        from m in [1, 2]
        select m - 1, m
    )
    select n
}.toList()
Exists
from n in [1, 2, 3]
where (
    from m in [2, 3]
    where m == n
    select m
).exists()
select n
Not Exists
from n in [1, 2, 3]
where !(
    from m in [2, 3]
    where m == n
    select m
).exists()
select n

1.1.4. 联接

可以通过联接子句指定 GINQ 的更多数据源。

from n1 in [1, 2, 3]
join n2 in [1, 3] on n1 == n2
select n1, n2
joininnerjoininnerhashjoin 更受欢迎,因为它具有更好的可读性,并且它足够智能,可以根据其 on 子句选择正确的具体联接(即 innerjoininnerhashjoin)。
from n1 in [1, 2, 3]
innerjoin n2 in [1, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
leftjoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
from n1 in [2, 3, 4]
rightjoin n2 in [1, 2, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
fulljoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
crossjoin n2 in [3, 4, 5]
select n1, n2

当数据源包含大量对象时,哈希联接特别有效

from n1 in [1, 2, 3]
innerhashjoin n2 in [1, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
lefthashjoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
from n1 in [2, 3, 4]
righthashjoin n2 in [1, 2, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
fullhashjoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
哈希联接的 on 子句中只允许二元表达式(==&&

1.1.5. 分组

groupby 等效于 SQL 的 GROUP BYhaving 等效于 SQL 的 HAVING

from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, count(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
having n >= 3
select n, count(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
having count() < 3
select n, count()

可以使用 as 子句重命名分组列

from s in ['ab', 'ac', 'bd', 'acd', 'bcd', 'bef']
groupby s.size() as length, s[0] as firstChar
select length, firstChar, max(s)
from s in ['ab', 'ac', 'bd', 'acd', 'bcd', 'bef']
groupby s.size() as length, s[0] as firstChar
having length == 3 && firstChar == 'b'
select length, firstChar, max(s)
聚合函数

GINQ 提供了一些内置的聚合函数

函数 参数类型(s) 返回类型 描述

count()

java.lang.Long

行数,类似于 SQL 中的 count(*)

count(expression)

任何

java.lang.Long

expression 值不为 null 的行数

min(expression)

java.lang.Comparable

与参数类型相同

所有非空值中表达式的最小值

max(expression)

java.lang.Comparable

与参数类型相同

所有非空值中表达式的最大值

sum(expression)

java.lang.Number

java.math.BigDecimal

所有非空值中表达式的总和

avg(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的平均值(算术平均值)

list(expression)

任何

java.util.List

所有非空值的聚合列表

median(expression)

java.lang.Number

java.math.BigDecimal

使非空值在其上方和下方的数量相同的数值(“中间”值,不一定与平均值或平均值相同)

stdev(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的统计标准差

stdevp(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的总体统计标准差

var(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的统计方差

varp(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的总体统计方差

agg(expression)

任何

任何

自定义 expression 中的聚合逻辑并返回单个值

from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, count()
from s in ['a', 'b', 'cd', 'ef']
groupby s.size() as length
select length, min(s)
from s in ['a', 'b', 'cd', 'ef']
groupby s.size() as length
select length, max(s)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, sum(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, avg(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, median(n)
assert [['A', ['APPLE', 'APRICOT']],
        ['B', ['BANANA']],
        ['C', ['CANTALOUPE']]] == GQL {
    from fruit in ['Apple', 'Apricot', 'Banana', 'Cantaloupe']
    groupby fruit[0] as firstChar
    select firstChar, list(fruit.toUpperCase()) as fruit_list
}
def persons = [new Person('Linda', 100, 'Female'),
               new Person('Daniel', 135, 'Male'),
               new Person('David', 122, 'Male')]
assert [['Male', ['Daniel', 'David']], ['Female', ['Linda']]] == GQL {
    from p in persons
    groupby p.gender
    select p.gender, list(p.name)
}
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, agg(_g.stream().map(r -> r.n).reduce(BigDecimal.ZERO, BigDecimal::add))
_gagg 聚合函数的隐式变量,它代表分组的 Queryable 对象,其记录(例如 r)可以通过别名(例如 n)引用数据源
from fruit in ['Apple', 'Apricot', 'Banana', 'Cantaloupe']
groupby fruit.substring(0, 1) as firstChar
select firstChar, agg(_g.stream().map(r -> r.fruit).toList()) as fruit_list

此外,我们可以对整个 GINQ 结果应用聚合函数,即不需要 groupby 子句

assert [3] == GQ {
    from n in [1, 2, 3]
    select max(n)
}.toList()
assert [[1, 3, 2, 2, 6, 3, 3, 6]] == GQ {
    from n in [1, 2, 3]
    select min(n), max(n), avg(n), median(n), sum(n), count(n), count(),
            agg(_g.stream().map(r -> r.n).reduce(BigDecimal.ZERO, BigDecimal::add))
}.toList()
assert [0.816496580927726] == GQ {
    from n in [1, 2, 3]
    select stdev(n)
}.toList()
assert [1] == GQ {
    from n in [1, 2, 3]
    select stdevp(n)
}.toList()
assert [0.6666666666666667] == GQ {
    from n in [1, 2, 3]
    select var(n)
}.toList()
assert [1] == GQ {
    from n in [1, 2, 3]
    select varp(n)
}.toList()

1.1.6. 排序

orderby 等效于 SQL 的 ORDER BY

from n in [1, 5, 2, 6]
orderby n
select n
in asc 是可选的,用于以升序排序
from n in [1, 5, 2, 6]
orderby n in asc
select n
from n in [1, 5, 2, 6]
orderby n in desc
select n
from s in ['a', 'b', 'ef', 'cd']
orderby s.length() in desc, s in asc
select s
from s in ['a', 'b', 'ef', 'cd']
orderby s.length() in desc, s
select s
from n in [1, null, 5, null, 2, 6]
orderby n in asc(nullslast)
select n
nullslast 等效于 SQL 的 NULLS LAST 并且默认应用。nullsfirst 等效于 SQL 的 NULLS FIRST
from n in [1, null, 5, null, 2, 6]
orderby n in asc(nullsfirst)
select n
from n in [1, null, 5, null, 2, 6]
orderby n in desc(nullslast)
select n
from n in [1, null, 5, null, 2, 6]
orderby n in desc(nullsfirst)
select n

1.1.7. 分页

limit 类似于 MySQL 的 limit 子句,它可以指定 offset(第一个参数)和 size(第二个参数)用于分页,或者仅指定一个参数作为 size

from n in [1, 2, 3, 4, 5]
limit 3
select n
from n in [1, 2, 3, 4, 5]
limit 1, 3
select n

1.1.8. 嵌套 GINQ

from 子句中嵌套 GINQ
from v in (
    from n in [1, 2, 3]
    select n
)
select v
where 子句中嵌套 GINQ
from n in [0, 1, 2]
where n in (
    from m in [1, 2]
    select m
)
select n
from n in [0, 1, 2]
where (
    from m in [1, 2]
    where m == n
    select m
).exists()
select n
select 子句中嵌套 GINQ
assert [null, 2, 3] == GQ {
    from n in [1, 2, 3]
    select (
        from m in [2, 3, 4]
        where m == n
        limit 1
        select m
    )
}.toList()
建议使用 limit 1 来限制子查询结果的数量,因为如果返回多个值,将抛出 TooManyValuesException

我们可以使用 as 子句来命名子查询结果

assert [[1, null], [2, 2], [3, 3]] == GQ {
    from n in [1, 2, 3]
    select n, (
        from m in [2, 3, 4]
        where m == n
        select m
    ) as sqr
}.toList()

1.1.9. 窗口函数

窗口可以通过 partitionbyorderbyrowsrange 来定义

over(
    [partitionby <expression> (, <expression>)*]
    [orderby <expression> (, <expression>)*
       [rows <lower>, <upper> | range <lower>, <upper>]]
)
  • 0 用作 rowsrange 子句的边界等效于 SQL 的 CURRENT ROW,负数表示 PRECEDING,正数表示 FOLLOWING

  • null 用作 rowsrange 子句的下界等效于 SQL 的 UNBOUNDED PRECEDING

  • null 用作 rowsrange 子句的上界等效于 SQL 的 UNBOUNDED FOLLOWING

此外,GINQ 还提供了一些内置的窗口函数

函数 参数类型(s) 返回类型 描述

rowNumber()

java.lang.Long

其分区内当前行的编号,从 0 开始计数

rank()

java.lang.Long

当前行带有间隙的等级

denseRank()

java.lang.Long

当前行无间隙的等级

percentRank()

java.math.BigDecimal

当前行的相对等级:(等级 - 1)/(总行数 - 1)

cumeDist()

java.math.BigDecimal

当前行的相对等级:(在当前行之前或与当前行同级的行数)/(总行数)

ntile(expression)

java.lang.Long

java.lang.Long

范围从 0expression - 1 的桶索引,尽可能均匀地划分分区

lead(expression [, offset [, default]])

任何 [, java.lang.Long [, 与 expression 类型相同]]

expression 类型相同

在分区内,返回当前行之后偏移量行处的表达式的计算结果;如果没有这样的行,则返回默认值(必须与表达式类型相同)。偏移量和默认值都是相对于当前行计算的。如果省略,则偏移量默认为 1,默认值为 null

lag(表达式 [, 偏移量 [, 默认值]])

任何 [, java.lang.Long [, 与 expression 类型相同]]

expression 类型相同

在分区内,返回当前行之前偏移量行处的表达式的计算结果;如果没有这样的行,则返回默认值(必须与表达式类型相同)。偏移量和默认值都是相对于当前行计算的。如果省略,则偏移量默认为 1,默认值为 null

firstValue(表达式)

任何

与表达式类型相同

返回窗口帧第一行处的表达式的计算结果

lastValue(表达式)

任何

与表达式类型相同

返回窗口帧最后一行处的表达式的计算结果

nthValue(表达式, n)

任意,java.lang.Long

与表达式类型相同

返回窗口帧第 n 行处的表达式的计算结果

count()

java.lang.Long

行数,类似于 SQL 中的 count(*)

count(expression)

任何

java.lang.Long

expression 值不为 null 的行数

min(expression)

java.lang.Comparable

与参数类型相同

所有非空值中表达式的最小值

max(expression)

java.lang.Comparable

与参数类型相同

所有非空值中表达式的最大值

sum(expression)

java.lang.Number

java.math.BigDecimal

所有非空值中表达式的总和

avg(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的平均值(算术平均值)

median(expression)

java.lang.Number

java.math.BigDecimal

使非空值在其上方和下方的数量相同的数值(“中间”值,不一定与平均值或平均值相同)

stdev(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的统计标准差

stdevp(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的总体统计标准差

var(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的统计方差

varp(expression)

java.lang.Number

java.math.BigDecimal

所有非空值的总体统计方差

agg(expression)

任何

任何

孵化阶段:自定义表达式中的聚合逻辑并返回单个值

rowNumber
assert [[2, 1, 1, 1], [1, 0, 0, 2], [null, 3, 3, 3], [3, 2, 2, 0]] == GQ {
    from n in [2, 1, null, 3]
    select n, (rowNumber() over(orderby n)),
              (rowNumber() over(orderby n in asc)),
              (rowNumber() over(orderby n in desc))
}.toList()
assert [[1, 0, 1, 2, 3], [2, 1, 2, 1, 2], [null, 3, 0, 3, 0], [3, 2, 3, 0, 1]] == GQ {
    from n in [1, 2, null, 3]
    select n, (rowNumber() over(orderby n in asc(nullslast))),
              (rowNumber() over(orderby n in asc(nullsfirst))),
              (rowNumber() over(orderby n in desc(nullslast))),
              (rowNumber() over(orderby n in desc(nullsfirst)))
}.toList()
窗口函数周围的括号是必需的。
rank, denseRank, percentRank, cumeDistntile
assert [['a', 1, 1], ['b', 2, 2], ['b', 2, 2],
        ['c', 4, 3], ['c', 4, 3], ['d', 6, 4],
        ['e', 7, 5]] == GQ {
    from s in ['a', 'b', 'b', 'c', 'c', 'd', 'e']
    select s,
        (rank() over(orderby s)),
        (denseRank() over(orderby s))
}.toList()
assert [[60, 0, 0.4], [60, 0, 0.4], [80, 0.5, 0.8], [80, 0.5, 0.8], [100, 1, 1]] == GQ {
    from n in [60, 60, 80, 80, 100]
    select n,
        (percentRank() over(orderby n)),
        (cumeDist() over(orderby n))
}.toList()
assert [[1, 0], [2, 0], [3, 0],
        [4, 1], [5, 1],
        [6, 2], [7, 2],[8, 2],
        [9, 3], [10, 3]] == GQ {
    from n in 1..10
    select n, (ntile(4) over(orderby n))
}.toList()
leadlag
assert [[2, 3], [1, 2], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n) over(orderby n))
}.toList()
assert [[2, 3], [1, 2], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n) over(orderby n in asc))
}.toList()
assert [['a', 'bc'], ['ab', null], ['b', 'a'], ['bc', 'ab']] == GQ {
    from s in ['a', 'ab', 'b', 'bc']
    select s, (lead(s) over(orderby s.length(), s in desc))
}.toList()
assert [['a', null], ['ab', null], ['b', 'a'], ['bc', 'ab']] == GQ {
    from s in ['a', 'ab', 'b', 'bc']
    select s, (lead(s) over(partitionby s.length() orderby s.length(), s in desc))
}.toList()
assert [[2, 1], [1, null], [3, 2]] == GQ {
    from n in [2, 1, 3]
    select n, (lag(n) over(orderby n))
}.toList()
assert [[2, 3], [1, 2], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (lag(n) over(orderby n in desc))
}.toList()
assert [['a', null], ['b', 'a'], ['aa', null], ['bb', 'aa']] == GQ {
    from s in ['a', 'b', 'aa', 'bb']
    select s, (lag(s) over(partitionby s.length() orderby s))
}.toList()
assert [[2, 3, 1], [1, 2, null], [3, null, 2]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n) over(orderby n)), (lag(n) over(orderby n))
}.toList()

可以指定除默认偏移量 1 之外的其他偏移量

assert [[2, null, null], [1, 3, null], [3, null, 1]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n, 2) over(orderby n)), (lag(n, 2) over(orderby n))
}.toList()

当偏移量指定的索引超出窗口范围时,可以返回默认值,例如 'NONE'

assert [[2, 'NONE', 'NONE'], [1, 3, 'NONE'], [3, 'NONE', 1]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n, 2, 'NONE') over(orderby n)), (lag(n, 2, 'NONE') over(orderby n))
}.toList()
firstValue, lastValuenthValue
assert [[2, 1], [1, 1], [3, 2]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows -1, 1))
}.toList()
assert [[2, 3], [1, 2], [3, 3]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows -1, 1))
}.toList()
assert [[2, 2], [1, 1], [3, 3]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows 0, 1))
}.toList()
assert [[2, 1], [1, null], [3, 1]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows -2, -1))
}.toList()
assert [[2, 1], [1, null], [3, 2]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows -2, -1))
}.toList()
assert [[2, 3], [1, 3], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows 1, 2))
}.toList()
assert [[2, 3], [1, 2], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows 1, 2))
}.toList()
assert [[2, 2], [1, 1], [3, 3]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows -1, 0))
}.toList()
assert [[2, 1], [1, 1], [3, 1]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows null, 1))
}.toList()
assert [[2, 3], [1, 3], [3, 3]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows -1, null))
}.toList()
assert [['a', 'a', 'b'], ['aa', 'aa', 'bb'], ['b', 'a', 'b'], ['bb', 'aa', 'bb']] == GQ {
    from s in ['a', 'aa', 'b', 'bb']
    select s, (firstValue(s) over(partitionby s.length() orderby s)),
            (lastValue(s) over(partitionby s.length() orderby s))
}.toList()
assert [[1, 1, 2, 3, null], [2, 1, 2, 3, null], [3, 1, 2, 3, null]] == GQ {
    from n in 1..3
    select n, (nthValue(n, 0) over(orderby n)),
              (nthValue(n, 1) over(orderby n)),
              (nthValue(n, 2) over(orderby n)),
              (nthValue(n, 3) over(orderby n))
}.toList()
min, max, count, sum, avg, median, stdev, stdevp, var ,varpagg
assert [['a', 'a', 'b'], ['b', 'a', 'b'], ['aa', 'aa', 'bb'], ['bb', 'aa', 'bb']] == GQ {
    from s in ['a', 'b', 'aa', 'bb']
    select s, (min(s) over(partitionby s.length())), (max(s) over(partitionby s.length()))
}.toList()
assert [[1, 2, 2, 2, 1, 1], [1, 2, 2, 2, 1, 1],
        [2, 2, 2, 4, 2, 2], [2, 2, 2, 4, 2, 2],
        [3, 2, 2, 6, 3, 3], [3, 2, 2, 6, 3, 3]] == GQ {
    from n in [1, 1, 2, 2, 3, 3]
    select n, (count() over(partitionby n)),
              (count(n) over(partitionby n)),
              (sum(n) over(partitionby n)),
              (avg(n) over(partitionby n)),
              (median(n) over(partitionby n))
}.toList()
assert [[2, 6, 3, 1, 3, 4], [1, 6, 3, 1, 3, 4],
        [3, 6, 3, 1, 3, 4], [null, 6, 3, 1, 3, 4]] == GQ {
    from n in [2, 1, 3, null]
    select n, (sum(n) over()),
              (max(n) over()),
              (min(n) over()),
              (count(n) over()),
              (count() over())
}.toList()
assert [[1, 1, 1], [2, 2, 3], [5, 2, 10], [5, 2, 10]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range -2, 0)),
              (sum(n) over(orderby n range -2, 0))
}.toList()
assert [[1, 2, 3], [2, 1, 2], [5, 2, 10], [5, 2, 10]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range 0, 1)),
              (sum(n) over(orderby n range 0, 1))
}.toList()
assert [[1, 2, 3], [2, 2, 3], [5, 2, 10], [5, 2, 10]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range -1, 1)),
              (sum(n) over(orderby n range -1, 1))
}.toList()
assert [[1, 1, 2], [2, 0, 0], [5, 0, 0], [5, 0, 0]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n in desc range 1, 2)),
              (sum(n) over(orderby n in desc range 1, 2))
}.toList()
assert [[1, 0, 0], [2, 1, 1], [5, 0, 0], [5, 0, 0]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n in desc range -2, -1)),
              (sum(n) over(orderby n in desc range -2, -1))
}.toList()
assert [[1, 3, 12], [2, 2, 10], [5, 0, 0], [5, 0, 0]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range 1, null)),
              (sum(n) over(orderby n range 1, null))
}.toList()
assert [[1, 2, 3], [2, 2, 3], [5, 4, 13], [5, 4, 13]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range null, 1)),
              (sum(n) over(orderby n range null, 1))
}.toList()
assert [[1, 0.816496580927726],
        [2, 0.816496580927726],
        [3, 0.816496580927726]] == GQ {
    from n in [1, 2, 3]
    select n, (stdev(n) over())
}.toList()
assert [[1, 1], [2, 1], [3, 1]] == GQ {
    from n in [1, 2, 3]
    select n, (stdevp(n) over())
}.toList()
assert [[1, 0.6666666666666667],
        [2, 0.6666666666666667],
        [3, 0.6666666666666667]] == GQ {
    from n in [1, 2, 3]
    select n, (var(n) over())
}.toList()
assert [[1, 1], [2, 1], [3, 1]] == GQ {
    from n in [1, 2, 3]
    select n, (varp(n) over())
}.toList()
assert [[1, 4], [2, 2], [3, 4]] == GQ {
    from n in [1, 2, 3]
    select n,
           (agg(_g.stream().map(r -> r.n).reduce(BigDecimal.ZERO, BigDecimal::add)) over(partitionby n % 2))
}.toList()

1.2. GINQ 技巧

1.2.1. 行号

_rn 是表示结果集中每条记录的行号的隐式变量。它从 0 开始

from n in [1, 2, 3]
select _rn, n

1.2.2. 列表推导

列表推导是一种基于现有列表定义和创建列表的优雅方式

assert [4, 16, 36, 64, 100] == GQ {from n in 1..<11 where n % 2 == 0 select n ** 2}.toList()
assert [4, 16, 36, 64, 100] == GQ {from n in 1..<11 where n % 2 == 0 select n ** 2} as List
assert [4, 16, 36, 64, 100] == GQL {from n in 1..<11 where n % 2 == 0 select n ** 2}
GQL {…​}GQ {…​}.toList() 的缩写

GINQ 可以直接在循环中用作列表推导

def result = []
for (def x : GQ {from n in 1..<11 where n % 2 == 0 select n ** 2}) {
    result << x
}
assert [4, 16, 36, 64, 100] == result

1.2.3. 查询和更新

这类似于 SQL 中的 update 语句

import groovy.transform.*
@TupleConstructor
@EqualsAndHashCode
@ToString
class Person {
    String name
    String nickname
}

def linda = new Person('Linda', null)
def david = new Person('David', null)
def persons = [new Person('Daniel', 'ShanFengXiaoZi'), linda, david]
def result = GQ {
    from p in persons
    where p.nickname == null
    select p
}.stream()
    .peek(p -> { p.nickname = 'Unknown' }) // update `nickname`
    .toList()

def expected = [new Person('Linda', 'Unknown'), new Person('David', 'Unknown')]
assert expected == result
assert ['Unknown', 'Unknown'] == [linda, david]*.nickname // ensure the original objects are updated

1.2.4. with 子句的替代方案

GINQ 目前不支持 with 子句,但我们可以定义一个临时变量来解决这个问题

def v = GQ { from n in [1, 2, 3] where n < 3 select n }
def result = GQ {
    from n in v
    where n > 1
    select n
}
assert [2] == result.toList()

1.2.5. case-when 的替代方案

SQL 的 case-when 可以用 switch 表达式替换

assert ['a', 'b', 'c', 'c'] == GQ {
    from n in [1, 2, 3, 4]
    select switch (n) {
        case 1 -> 'a'
        case 2 -> 'b'
        default -> 'c'
    }
}.toList()

1.2.6. 查询 JSON

import groovy.json.JsonSlurper
def json = new JsonSlurper().parseText('''
    {
        "fruits": [
            {"name": "Orange", "price": 11},
            {"name": "Apple", "price": 6},
            {"name": "Banana", "price": 4},
            {"name": "Mongo", "price": 29},
            {"name": "Durian", "price": 32}
        ]
    }
''')

def expected = [['Mongo', 29], ['Orange', 11], ['Apple', 6], ['Banana', 4]]
assert expected == GQ {
    from f in json.fruits
    where f.price < 32
    orderby f.price in desc
    select f.name, f.price
}.toList()

1.2.7. 并行查询

并行查询在查询大型数据源时特别有效。默认情况下它处于禁用状态,但我们可以手动启用它

assert [[1, 1], [2, 2], [3, 3]] == GQ(parallel: true) {
    from n1 in 1..1000
    join n2 in 1..10000 on n2 == n1
    where n1 <= 3 && n2 <= 5
    select n1, n2
}.toList()

由于并行查询将使用共享线程池,以下代码可以在所有 GINQ 语句执行完成后释放资源,并且它将等待所有线程任务完成。

GQ {
    shutdown
}
一旦发出 shutdown 命令,并行查询将不再工作。

以下代码等效于上述代码,换句话说,immediate 是可选的

GQ {
    shutdown immediate
}

在不等待任务完成的情况下关闭

GQ {
    shutdown abort
}

1.2.8. 自定义 GINQ

对于高级用户,您可以通过指定自己的目标代码生成器来自定义 GINQ 行为。例如,我们可以指定限定类名 org.apache.groovy.ginq.provider.collection.GinqAstWalker 作为目标代码生成器,以生成用于查询集合的 GINQ 方法调用,这是 GINQ 的默认行为

assert [0, 1, 2] == GQ(astWalker: 'org.apache.groovy.ginq.provider.collection.GinqAstWalker') {
    from n in [0, 1, 2]
    select n
}.toList()

1.2.9. 优化 GINQ

GINQ 优化器默认情况下处于启用状态,以提高性能。它将转换 GINQ AST 以实现更好的执行计划。我们可以手动禁用它

assert [[2, 2]] == GQ(optimize: false) {
    from n1 in [1, 2, 3]
    join n2 in [1, 2, 3] on n1 == n2
    where n1 > 1 &&  n2 < 3
    select n1, n2
}.toList()

1.3. GINQ 示例

1.3.1. 生成乘法表

from v in (
    from a in 1..9
    join b in 1..9 on a <= b
    select a as f, b as s, "$a * $b = ${a * b}".toString() as r
)
groupby v.s
select max(v.f == 1 ? v.r : '') as v1,
       max(v.f == 2 ? v.r : '') as v2,
       max(v.f == 3 ? v.r : '') as v3,
       max(v.f == 4 ? v.r : '') as v4,
       max(v.f == 5 ? v.r : '') as v5,
       max(v.f == 6 ? v.r : '') as v6,
       max(v.f == 7 ? v.r : '') as v7,
       max(v.f == 8 ? v.r : '') as v8,
       max(v.f == 9 ? v.r : '') as v9

1.3.2. 更多示例

链接:最新的 GINQ 示例

上述链接中的一些示例需要运行 Groovy 的最新 SNAPSHOT 版本。