SQL 编码风格指南
本指南旨在提供一套统一的 SQL 编码风格规范,以提高代码的可读性、可维护性和一致性。
1. 命名约定 (Naming Conventions)
1.1. 表名 (Table Names)
表名应当使用蛇形命名法 (snake_case) 的复数名词。
推荐:
1
2
select * from users;
select * from visit_logs;
不推荐:
1
2
select * from user; -- 应为复数
select * from visitLog; -- 应为 snake_case
1.2. 列别名 (Column Aliases)
列别名应使用蛇形命名法 (snake_case)。
推荐:
1
2
3
4
5
select
id,
email,
timestamp_trunc(created_at, month) as signup_month
from users;
不推荐:
1
2
3
4
5
select
id,
email,
timestamp_trunc(created_at, month) as SignupMonth -- 应为 snake_case
from users;
1.3. 公用表表达式 (CTE) 命名
为公用表表达式 (CTE) 使用具有描述性的名称。
推荐:
1
2
3
with ordered_details as (
...
)
不推荐:
1
2
3
with d1 as ( -- 名称过于随意,无法体现其作用
...
)
2. 代码布局 (Code Layout)
2.1. 大小写 (Capitalization)
SQL 关键字和函数名应全部使用小写。
推荐:
1
select * from users;
不推荐:
1
SELECT * FROM users; -- 关键字不应大写
不推荐:
1
Select * From users; -- 关键字不应首字母大写
2.2. 缩进与对齐 (Indentation and Alignment)
左对齐所有关键字,不要为了对齐而手动添加不必要的空格。
推荐:
1
2
3
select id, email
from users
where email like '%@gmail.com';
不推荐:
1
2
3
select id, email
from users -- "from" 不应缩进
where email like '%@gmail.com'; -- "where" 不应缩进
2.3. 逗号位置 (Comma Position)
逗号应位于行尾。
推荐:
1
2
3
4
select
id,
email
from users;
不推荐:
1
2
3
4
select
id
, email -- 逗号应在上一行末尾
from users;
2.4. 多行查询 (Multi-line Queries)
只有在查询非常简单(例如,只查询一列或进行简单计数)时才使用单行 SQL。对于包含多个列或复杂逻辑的查询,应分散在多行中以提高可读性。
推荐 (单行):
1
2
3
select * from users;
select id from users;
select count(*) from users;
推荐 (多行):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
select
id,
email,
created_at
from users;
select *
from users
where email = 'example@domain.com';
select
user_id,
count(*) as total_charges
from charges
group by user_id;
不推荐:
1
2
3
4
5
6
select id, email, created_at -- 多列查询应换行
from users;
select id,
email -- 列对齐不一致
from users;
3. 查询语法 (Query Syntax)
3.1. AS 关键字 (AS Keyword)
为别名使用 AS 关键字,因为明确优于隐式。
推荐:
1
2
3
4
5
select
id,
email,
timestamp_trunc(created_at, month) as signup_month
from users;
不推荐:
1
2
3
4
5
select
id,
email,
timestamp_trunc(created_at, month) signup_month -- 缺少 AS 关键字
from users;
3.2. 连接 (Joins)
inner join 中的 on 子句应当另起一行并缩进。当有多个连接条件时,将每个条件放在它们自己的缩进行上。
推荐:
1
2
3
4
5
6
select
users.email,
sum(charges.amount) as total_revenue
from users
inner join charges on users.id = charges.user_id
group by email;
推荐 (多条件):
1
2
3
4
5
6
7
8
select
users.email,
sum(charges.amount) as total_revenue
from users
inner join charges on
users.id = charges.user_id and
charges.refunded = false
group by email;
不推荐:
1
2
3
4
5
6
7
select
users.email,
sum(charges.amount) as total_revenue
from users
inner join charges -- "on" 子句应与 "join" 在同一行或另起一行并缩进
on users.id = charges.user_id
group by email;
显式声明连接类型(inner join, left join 等)优于隐式(join),这样连接类型会非常清晰。
推荐:
1
2
3
4
5
select
users.email,
sum(charges.amount) as total_revenue
from users
inner join charges on users.id = charges.user_id;
不推荐:
1
2
3
4
5
select
users.email,
sum(charges.amount) as total_revenue
from users
join charges on users.id = charges.user_id; -- 隐式的 "join" 默认为 "inner join",但不够明确
3.3. 表别名 (Table Aliases)
应避免给表名取不必要的别名,除非是自连接或表名过长。如果使用别名,别名应具有意义。
推荐 (有意义的别名):
1
2
3
4
5
6
select
companies.com_name,
beacons.created_at
from stg_mysql_helpscout__helpscout_companies as companies
inner join stg_mysql_helpscout__helpscout_beacons_v2 as beacons
on companies.com_id = beacons.com_id;
不推荐 (不清晰的别名):
1
2
3
4
5
select
u.email,
sum(c.amount) as total_revenue
from users as u -- "u" 和 "c" 是无意义的单字母别名
inner join charges as c on u.id = c.user_id;
3.4. 列引用 (Column References)
当查询不涉及连接时,可以省略表名。
推荐:
1
2
3
4
select
id,
name
from companies;
不推荐:
1
2
3
4
select
companies.id,
companies.name
from companies;
但当涉及连接时,最好明确地写上表名,这样可以清楚地知道列的来源。
推荐:
1
2
3
4
5
select
users.email,
sum(charges.amount) as total_revenue
from users
inner join charges on users.id = charges.user_id;
不推荐:
1
2
3
4
5
select
email,
sum(amount) as total_revenue
from users
inner join charges on users.id = charges.user_id; -- "email" 和 "amount" 的来源不明确
3.5. WHERE 子句 (WHERE Clause)
当只有一个条件时,可以将其放在 where 同一行。如果存在多个条件,则每个条件都应比 where 缩进一个级别,并将逻辑运算符 and/or 放在前一个条件的末尾。
推荐 (单条件):
1
2
3
select email
from users
where id = 1234;
推荐 (多条件):
1
2
3
4
5
select id, email
from users
where
created_at >= '2019-03-01' and
vertical = 'work';
3.6. GROUP BY 子句 (GROUP BY Clause)
在 GROUP BY 子句中使用列别名或位置引用,以保持简洁。
推荐 (使用别名):
1
2
3
4
5
select
timestamp_trunc(com_created_at, year) as signup_year,
count(*) as total_companies
from companies
group by signup_year;
推荐 (使用位置引用):
1
2
3
select user_id, count(*) as total_charges
from charges
group by 1; -- "1" 代表第一列 "user_id"
不推荐:
1
2
3
4
5
select
timestamp_trunc(com_created_at, year) as signup_year,
count(*) as total_companies
from companies
group by timestamp_trunc(com_created_at, year); -- 重复了复杂的表达式
3.7. IN 子句 (IN Clause)
对于长列表,将值格式化为多个缩进的行。
推荐:
1
2
3
4
5
6
7
8
select *
from users
where email in (
'user-1@example.com',
'user-2@example.com',
'user-3@example.com',
'user-4@example.com'
);
逗号前后不应有空格。
推荐:
1
2
3
select *
from users
where id in (1, 2);
不推荐:
1
2
3
select *
from users
where id in ( 1, 2 ); -- 括号内和逗号周围不应有空格
3.8. CASE 语句 (CASE Statements)
case 和每个 when 都需要独占一行,并且 when 比 case 缩进深一行。then 应该和其对应的 when 在同一行,或者在下一行并有更深一级缩进。
推荐:
1
2
3
4
5
6
7
select
case
when event_name = 'viewed_homepage' then 'Homepage'
when event_name = 'viewed_editor' then 'Editor'
else 'Other'
end as page_name
from events;
同样推荐:
1
2
3
4
5
6
7
8
9
select
case
when event_name = 'viewed_homepage'
then 'Homepage'
when event_name = 'viewed_editor'
then 'Editor'
else 'Other'
end as page_name
from events;
不推荐:
1
2
3
4
5
6
select
case when event_name = 'viewed_homepage' then 'Homepage' -- "case" 和 "when" 应在不同行
when event_name = 'viewed_editor' then 'Editor'
else 'Other'
end as page_name
from events;
3.9. 公用表表达式 (CTEs)
优先使用 CTE 而不是子查询,因为 CTE 可以让你的查询逻辑更清晰、更易于阅读。建议在 CTE 链的末尾定义一个名为 final 的 CTE,并最终从 final 中查询。
推荐:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
with ordered_details as (
select
user_id,
name,
row_number() over (partition by user_id order by date_updated desc) as details_rank
from billingdaddy.billing_stored_details
),
final as (
select user_id, name
from ordered_details
where details_rank = 1
)
select * from final;
不推荐:
1
2
3
4
5
6
7
8
9
select user_id, name
from ( -- 使用了子查询,可读性较差
select
user_id,
name,
row_number() over (partition by user_id order by date_updated desc) as details_rank
from billingdaddy.billing_stored_details
) as ranked
where details_rank = 1;
注:CTE (Common Table Expressions) 在 PostgreSQL, Oracle, MS SQL Server, 以及 MySQL 8.0+ 版本中得到支持。
3.10. 字符串引号 (String Quotes)
始终使用单引号 (') 来表示字符串。双引号 (") 在某些 SQL 方言中可能被解释为列名或标识符。
推荐:
1
2
3
select *
from users
where email = 'example@domain.com';
不推荐:
1
2
3
select *
from users
where email = "example@domain.com"; -- 双引号可能导致兼容性问题
3.11. 不等于操作符 (Not-equal Operator)
使用 != 而不是 <>。虽然两者在功能上通常等价,但 != 在多种编程语言中更为通用。
推荐:
1
2
3
select count(*) as paying_users_count
from users
where plan_name != 'free';
3.12. 布尔值比较 (Boolean Comparisons)
明确地与 true 或 false 进行比较,以增强代码的清晰度。
推荐:
1
2
select * from customers where is_cancelled = true;
select * from customers where is_cancelled = false;
不推荐:
1
2
select * from customers where is_cancelled; -- 隐式比较可能导致误解
select * from customers where not is_cancelled;
4. 列选择 (Column Selection)
4.1. 单列查询 (Single Column Queries)
对于只选择一列的查询,如果使用了别名,请将别名和列放在同一行。
推荐:
1
2
3
4
5
6
select count(*) as total_users
from users;
select timestamp_millis(property_beacon_interest) as expressed_interest_at
from hubspot.contact
where property_beacon_interest is not null;
不推荐:
1
2
3
4
5
6
select count(*) -- 别名应与列在同一行
from users;
select timestamp_millis(property_beacon_interest) -- 别名应与列在同一行
from hubspot.contact
where property_beacon_interest is not null;
4.2. 列的顺序 (Column Order)
将聚合结果列放在前面,以突出查询的核心结果。
推荐:
1
2
3
4
5
select
timestamp_trunc(com_created_at, year) as signup_year,
count(*) as total_companies
from companies
group by signup_year;
不推荐:
1
2
3
4
5
select
count(*) as total_companies,
timestamp_trunc(com_created_at, year) as signup_year
from mysql_helpscout.helpscout_companies
group by signup_year;
在选择表的列时,建议首先放置主键,然后是外键,最后是所有其他列。如果表有系统列(如 created_at, updated_at, is_deleted 等),将这些列放在最后。
推荐:
1
2
3
4
5
select
id,
name,
created_at
from users;
不推荐:
1
2
3
4
5
select
created_at,
name,
id
from users;
4.3. 窗口函数 (Window Functions)
你可以将窗口函数写在一行,也可以拆成多行,这取决于它的长度和复杂性。
推荐 (单行):
1
2
3
4
5
select
user_id,
name,
row_number() over (partition by user_id order by date_updated desc) as details_rank
from billingdaddy.billing_stored_details;
推荐 (多行): ```sql select user_id, name, row_number() over ( partition by user_id order by date_updated desc ) as details_rank from billingdaddy.billing_stored_details;