SQL 编码风格指南

发表于 2017/10/12

作者 deathwhispers

13 分钟阅读

SQL 编码风格指南

本指南旨在提供一套统一的 SQL 编码风格规范，以提高代码的可读性、可维护性和一致性。

1. 命名约定 (Naming Conventions)

1.1. 表名 (Table Names)

表名应当使用蛇形命名法 (snake_case) 的复数名词。

推荐:

  
select * from users;
select * from visit_logs;

不推荐:

  
select * from user; -- 应为复数
select * from visitLog; -- 应为 snake_case

1.2. 列别名 (Column Aliases)

列别名应使用蛇形命名法 (snake_case)。

推荐:

  
select
    id,
    email,
    timestamp_trunc(created_at, month) as signup_month
from users;

不推荐:

  
select
    id,
    email,
    timestamp_trunc(created_at, month) as SignupMonth -- 应为 snake_case
from users;

1.3. 公用表表达式 (CTE) 命名

为公用表表达式 (CTE) 使用具有描述性的名称。

推荐:

  
with ordered_details as (
  ...
)

不推荐:

  
with d1 as ( -- 名称过于随意，无法体现其作用
  ...
)

2. 代码布局 (Code Layout)

2.1. 大小写 (Capitalization)

SQL 关键字和函数名应全部使用小写。

推荐:

  
select * from users;

不推荐:

  
SELECT * FROM users; -- 关键字不应大写

不推荐:

  
Select * From users; -- 关键字不应首字母大写

2.2. 缩进与对齐 (Indentation and Alignment)

左对齐所有关键字，不要为了对齐而手动添加不必要的空格。

推荐:

  
select id, email
from users
where email like '%@gmail.com';

不推荐:

  
select id, email
  from users -- "from" 不应缩进
 where email like '%@gmail.com'; -- "where" 不应缩进

2.3. 逗号位置 (Comma Position)

逗号应位于行尾。

推荐:

  
select
    id,
    email
from users;

不推荐:

  
select
    id
    , email -- 逗号应在上一行末尾
from users;

2.4. 多行查询 (Multi-line Queries)

只有在查询非常简单（例如，只查询一列或进行简单计数）时才使用单行 SQL。对于包含多个列或复杂逻辑的查询，应分散在多行中以提高可读性。

推荐 (单行):

  
select * from users;
select id from users;
select count(*) from users;

推荐 (多行):

  
select
    id,
    email,
    created_at
from users;

select *
from users
where email = 'example@domain.com';

select
    user_id,
    count(*) as total_charges
from charges
group by user_id;

不推荐:

  
select id, email, created_at -- 多列查询应换行
from users;

select id,
    email -- 列对齐不一致
from users;

3. 查询语法 (Query Syntax)

3.1. AS 关键字 (AS Keyword)

为别名使用 AS 关键字，因为明确优于隐式。

推荐:

  
select
    id,
    email,
    timestamp_trunc(created_at, month) as signup_month
from users;

不推荐:

  
select
    id,
    email,
    timestamp_trunc(created_at, month) signup_month -- 缺少 AS 关键字
from users;

3.2. 连接 (Joins)

inner join 中的 on 子句应当另起一行并缩进。当有多个连接条件时，将每个条件放在它们自己的缩进行上。

推荐:

  
select
    users.email,
    sum(charges.amount) as total_revenue
from users
inner join charges on users.id = charges.user_id
group by email;

推荐 (多条件):

  
select
    users.email,
    sum(charges.amount) as total_revenue
from users
inner join charges on
    users.id = charges.user_id and
    charges.refunded = false
group by email;

不推荐:

  
select
    users.email,
    sum(charges.amount) as total_revenue
from users
inner join charges -- "on" 子句应与 "join" 在同一行或另起一行并缩进
on users.id = charges.user_id
group by email;

显式声明连接类型（inner join, left join 等）优于隐式（join），这样连接类型会非常清晰。

推荐:

  
select
    users.email,
    sum(charges.amount) as total_revenue
from users
inner join charges on users.id = charges.user_id;

不推荐:

  
select
    users.email,
    sum(charges.amount) as total_revenue
from users
join charges on users.id = charges.user_id; -- 隐式的 "join" 默认为 "inner join"，但不够明确

3.3. 表别名 (Table Aliases)

应避免给表名取不必要的别名，除非是自连接或表名过长。如果使用别名，别名应具有意义。

推荐 (有意义的别名):

  
select
  companies.com_name,
  beacons.created_at
from stg_mysql_helpscout__helpscout_companies as companies
inner join stg_mysql_helpscout__helpscout_beacons_v2 as beacons
        on companies.com_id = beacons.com_id;

不推荐 (不清晰的别名):

  
select
    u.email,
    sum(c.amount) as total_revenue
from users as u -- "u" 和 "c" 是无意义的单字母别名
inner join charges as c on u.id = c.user_id;

3.4. 列引用 (Column References)

当查询不涉及连接时，可以省略表名。

推荐:

  
select
    id,
    name
from companies;

不推荐:

  
select
    companies.id,
    companies.name
from companies;

但当涉及连接时，最好明确地写上表名，这样可以清楚地知道列的来源。

推荐:

  
select
    users.email,
    sum(charges.amount) as total_revenue
from users
inner join charges on users.id = charges.user_id;

不推荐:

  
select
    email,
    sum(amount) as total_revenue
from users
inner join charges on users.id = charges.user_id; -- "email" 和 "amount" 的来源不明确

3.5. WHERE 子句 (WHERE Clause)

当只有一个条件时，可以将其放在 where 同一行。如果存在多个条件，则每个条件都应比 where 缩进一个级别，并将逻辑运算符 and/or 放在前一个条件的末尾。

推荐 (单条件):

  
select email
from users
where id = 1234;

推荐 (多条件):

  
select id, email
from users
where
    created_at >= '2019-03-01' and
    vertical = 'work';

3.6. GROUP BY 子句 (GROUP BY Clause)

在 GROUP BY 子句中使用列别名或位置引用，以保持简洁。

推荐 (使用别名):

  
select
  timestamp_trunc(com_created_at, year) as signup_year,
  count(*) as total_companies
from companies
group by signup_year;

推荐 (使用位置引用):

  
select user_id, count(*) as total_charges
from charges
group by 1; -- "1" 代表第一列 "user_id"

不推荐:

  
select
  timestamp_trunc(com_created_at, year) as signup_year,
  count(*) as total_companies
from companies
group by timestamp_trunc(com_created_at, year); -- 重复了复杂的表达式

3.7. IN 子句 (IN Clause)

对于长列表，将值格式化为多个缩进的行。

推荐:

  
select *
from users
where email in (
  'user-1@example.com',
  'user-2@example.com',
  'user-3@example.com',
  'user-4@example.com'
);

逗号前后不应有空格。

推荐:

  
select *
from users
where id in (1, 2);

不推荐:

  
select *
from users
where id in ( 1, 2 ); -- 括号内和逗号周围不应有空格

3.8. CASE 语句 (CASE Statements)

case 和每个 when 都需要独占一行，并且 when 比 case 缩进深一行。then 应该和其对应的 when 在同一行，或者在下一行并有更深一级缩进。

推荐:

  
select
    case
        when event_name = 'viewed_homepage' then 'Homepage'
        when event_name = 'viewed_editor' then 'Editor'
        else 'Other'
    end as page_name
from events;

同样推荐:

  
select
    case
        when event_name = 'viewed_homepage'
            then 'Homepage'
        when event_name = 'viewed_editor'
            then 'Editor'
        else 'Other'
    end as page_name
from events;

不推荐:

  
select
    case when event_name = 'viewed_homepage' then 'Homepage' -- "case" 和 "when" 应在不同行
        when event_name = 'viewed_editor' then 'Editor'
        else 'Other'
    end as page_name
from events;

3.9. 公用表表达式 (CTEs)

优先使用 CTE 而不是子查询，因为 CTE 可以让你的查询逻辑更清晰、更易于阅读。建议在 CTE 链的末尾定义一个名为 final 的 CTE，并最终从 final 中查询。

推荐:

  
with ordered_details as (

  select
    user_id,
    name,
    row_number() over (partition by user_id order by date_updated desc) as details_rank
  from billingdaddy.billing_stored_details

),

final as (

  select user_id, name
  from ordered_details
  where details_rank = 1

)

select * from final;

不推荐:

  
select user_id, name
from ( -- 使用了子查询，可读性较差
  select
    user_id,
    name,
    row_number() over (partition by user_id order by date_updated desc) as details_rank
  from billingdaddy.billing_stored_details
) as ranked
where details_rank = 1;

注：CTE (Common Table Expressions) 在 PostgreSQL, Oracle, MS SQL Server, 以及 MySQL 8.0+ 版本中得到支持。

3.10. 字符串引号 (String Quotes)

始终使用单引号 (') 来表示字符串。双引号 (") 在某些 SQL 方言中可能被解释为列名或标识符。

推荐:

  
select *
from users
where email = 'example@domain.com';

不推荐:

  
select *
from users
where email = "example@domain.com"; -- 双引号可能导致兼容性问题

3.11. 不等于操作符 (Not-equal Operator)

使用 != 而不是 <>。虽然两者在功能上通常等价，但 != 在多种编程语言中更为通用。

推荐:

  
select count(*) as paying_users_count
from users
where plan_name != 'free';

3.12. 布尔值比较 (Boolean Comparisons)

明确地与 true 或 false 进行比较，以增强代码的清晰度。

推荐:

  
select * from customers where is_cancelled = true;
select * from customers where is_cancelled = false;

不推荐:

  
select * from customers where is_cancelled; -- 隐式比较可能导致误解
select * from customers where not is_cancelled;

4. 列选择 (Column Selection)

4.1. 单列查询 (Single Column Queries)

对于只选择一列的查询，如果使用了别名，请将别名和列放在同一行。

推荐:

  
select count(*) as total_users
from users;

select timestamp_millis(property_beacon_interest) as expressed_interest_at
from hubspot.contact
where property_beacon_interest is not null;

不推荐:

  
select count(*) -- 别名应与列在同一行
from users;

select timestamp_millis(property_beacon_interest) -- 别名应与列在同一行
from hubspot.contact
where property_beacon_interest is not null;

4.2. 列的顺序 (Column Order)

将聚合结果列放在前面，以突出查询的核心结果。

推荐:

  
select
  timestamp_trunc(com_created_at, year) as signup_year,
  count(*) as total_companies
from companies
group by signup_year;

不推荐:

  
select
  count(*) as total_companies,
  timestamp_trunc(com_created_at, year) as signup_year
from mysql_helpscout.helpscout_companies
group by signup_year;

在选择表的列时，建议首先放置主键，然后是外键，最后是所有其他列。如果表有系统列（如 created_at, updated_at, is_deleted 等），将这些列放在最后。

推荐:

  
select
    id,
    name,
    created_at
from users;

不推荐:

  
select
    created_at,
    name,
    id
from users;

4.3. 窗口函数 (Window Functions)

你可以将窗口函数写在一行，也可以拆成多行，这取决于它的长度和复杂性。

推荐 (单行):

  
select
    user_id,
    name,
    row_number() over (partition by user_id order by date_updated desc) as details_rank
from billingdaddy.billing_stored_details;

推荐 (多行): ```sql select user_id, name, row_number() over ( partition by user_id order by date_updated desc ) as details_rank from billingdaddy.billing_stored_details;

Database

Sql sql-style-guide

本文由作者按照 CC BY 4.0 进行授权

1. 命名约定 (Naming Conventions)

1.1. 表名 (Table Names)

1.2. 列别名 (Column Aliases)

1.3. 公用表表达式 (CTE) 命名

2. 代码布局 (Code Layout)

2.1. 大小写 (Capitalization)

2.2. 缩进与对齐 (Indentation and Alignment)

2.3. 逗号位置 (Comma Position)

2.4. 多行查询 (Multi-line Queries)

3. 查询语法 (Query Syntax)

3.1. AS 关键字 (AS Keyword)

3.2. 连接 (Joins)

3.3. 表别名 (Table Aliases)

3.4. 列引用 (Column References)

3.5. WHERE 子句 (WHERE Clause)

3.6. GROUP BY 子句 (GROUP BY Clause)

3.7. IN 子句 (IN Clause)

3.8. CASE 语句 (CASE Statements)

3.9. 公用表表达式 (CTEs)

3.10. 字符串引号 (String Quotes)

3.11. 不等于操作符 (Not-equal Operator)

3.12. 布尔值比较 (Boolean Comparisons)

4. 列选择 (Column Selection)

4.1. 单列查询 (Single Column Queries)

4.2. 列的顺序 (Column Order)

4.3. 窗口函数 (Window Functions)

热门标签