# 力扣专项突破SQL **Repository Path**: hongmiaoliao/LeetCodeSQL ## Basic Information - **Project Name**: 力扣专项突破SQL - **Description**: 在做题过程中记录的专项突破SQL答案及过程,包括SQL入门、SQL基础、SQL进阶。使用MySQL语法,许多题目使用了窗口函数简化解题方法 - **Primary Language**: Unknown - **License**: MIT - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 2 - **Forks**: 0 - **Created**: 2023-01-22 - **Last Updated**: 2023-05-14 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # (专项突破SQL-SQL入门) [leetcode页面](https://leetcode.cn/study-plan/sql/?progress=4oysn33) [toc] ## 选择 ### 595 大的国家 方法一:使用where和or ```SQL select name, population, area from World where area >= 3000000 or population >= 25000000; ``` 方法二:使用where子句和UNION 使用or会使索引失效,在数据量大的时候查找效率较低,通常建议使用union代替or ```SQL select name, population, area from world where area >= 3000000 union select name, population, area from world where population >= 25000000; ``` ### 1757 可回收且低脂的产品 方法:使用where和and,sql等号是一个等号 = ```SQL select product_id from Products where low_fats = 'Y' and recyclable = 'Y'; ``` ### 584 寻找用户推荐人 方法 tips: * NULL不是值,所以不能对其使用谓词,如果使用谓词,结果是unknown。 * SQL的保留字中,很多都被归类为谓词一类,例如>,<>,= 等比较谓词,以及BETWEEN, LIKE, IN, IS NULL等。总结,谓词是一种特殊的函数,返回值是真值。(前面提到的诶个谓词,返回值都是true, false, unknown,SQL是三值逻辑,所以有三种真值)。 * 因为查询结果只会包含WHERE子句里的判断结果为true的行!不包含判断结果为false和unknown的行。且不仅是等号,对NULL使用其他比较谓词(比如> NULL),结果也都是unknown。 记忆:优先级: AND: false > unknown > true OR: true > unknown > false 方法一: 使用or或联合(联合需要加all), ```SQL # select name from customer where referee_id != 2 or referee_id is null; select name from customer where referee_id != 2 union all select name from customer where referee_id is null; ``` 方法二:使用not in ```SQL select name from customer where id not in (select id from customer where referee_id = 2); ``` ### 183 从不订购的客户 方法:使用not in语句 ```SQL select Name Customers from Customers C where C.Id not in (select CustomerId from Orders); ``` ## 排序&修改 ### 1873 计算特殊奖金 方法一:使用if,条件也可以name like 'M%' ```SQL select employee_id, if(left(name, 1) = 'M' or mod(employee_id, 2) = 0, 0, salary) as bonus from Employees order by employee_id; ``` 方法二:使用CASE WHEN ... THEN ... else .. ```SQL SELECT employee_id, CASE WHEN MOD(employee_id, 2) = 1 AND left(name, 1) != 'M' THEN salary ELSE 0 END AS bonus FROM Employees ORDER BY employee_id; ``` ### 627 变更性别 方法一:使用if ```SQL update Salary set sex = if(sex = 'f', 'm', 'f'); ``` 方法二:使用case when ```SQL UPDATE salary SET sex = CASE sex WHEN 'm' THEN 'f' ELSE 'm' END; ``` ### 196 删除重复的电子邮箱 方法一:不能从select里面删除,需要再创建一个临时表,于是有 ```SQL delete from Person where id not in (select * from (select min(id) from Person group by email) as p); ``` 方法二:使用自连接 ```SQL delete p1 from Person p1, Person p2 where p1.email = p2.email && p1.id > p2.id; ``` ## 字符串处理函数/正则 字符串拼接函数 * CONCAT(s1, s2,...,sn),将字符串s1,s2...,sn拼接成一个字符串 * left(str, n),从字符串左边截取n个字符 * substring(str, n),截取从字符串第n个字符到结尾,substring(str,n,len) 截取从n开始的len个字符 * lower(str)全部变为小写,upper(str)全部变为大写 * group_concat(),和group by相关,将同一个分组的值连接起来,返回一个字符串结果 ### 1667 修复表中的名字 方法:使用字符串拼接函数 ```SQL select user_id, concat(upper(left(name, 1)), lower(substring(name, 2))) name from Users order by user_id; ``` ### 1484 按日期分组销售产品 方法: 使用group by 配合group_concat函数,注意需要用distinct去重 ```SQL select sell_date, count(distinct product) num_sold, group_concat(distinct product) products from Activities group by sell_date order by sell_date; ``` ### 1527 患某种疾病的患者 方法一:使用or判断两种情况 ```SQL select * from Patients where conditions like 'DIAB1%' or conditions like '% DIAB1%'; ``` 方法二:使用正则表达式 ```SQL select * from Patients where conditions regexp '^DIAB1|.*\\sDIAB1'; # \\s为空格 ``` ## 组合查询&指定选取 ### 1965 丢失信息的雇员 方法:先将两个表的id进行并集,这个集合如果id只出现一次,说明在某一方缺失。只出现一次的id就是答案。 ```SQL select employee_id from (select employee_id from Employees union all select employee_id from Salaries ) as tmp group by employee_id having count(employee_id) = 1 order by employee_id; ``` ### 1795 每个产品在不同商店的价格 方法一:行转列,使用union联合, ```SQL select product_id, "store1" store, store1 price from Products where store1 is not null union select product_id, "store2" store, store2 price from Products where store2 is not null union select product_id, "store3" store, store3 price from Products where store3 is not null; ``` 行转列,列转行,使用group by + if, 由于group by需要配合上聚合函数,所以对于只有一个值的分组,用sum,max,min都可以。是上面的逆操作 ```SQL SELECT product_id, SUM(IF(store = 'store1', price, NULL)) store1, SUM(IF(store = 'store2', price, NULL)) store2, SUM(IF(store = 'store3', price, NULL)) store3 FROM Products1 GROUP BY product_id ; ``` ### 608 树节点 方法一:使用两重if,先判断是否是根节点,再判断是否是叶子结点 ```SQL select id, if(p_id is null, "Root", if(id in (select p_id from tree), "Inner", "Leaf")) type from tree; ``` 方法二:使用case when流程控制语句 ```SQL select id, case when id = (select id from tree where p_id is null) then "Root" when id in (select p_id from tree) then "Inner" else "Leaf" end type from tree; ``` ### 176 第二高的薪水 注意点:需要去重distinct,要考虑没有第k大的值需要返回null 方法一:使用order by + limit offset,再多一重select可以在没有第k大(offset k-1)时返回null ```SQL select ( select distinct salary from Employee order by salary desc limit 1 offset 1 ) SecondHighestSalary; ``` 方法二:ifnull函数和LIMIT子句 ```SQL select ifnull( (select distinct salary from Employee order by salary desc limit 1 offset 1), null) as SecondHighestSalary; ``` ## 合并 ### 175 组合两个表 方法:使用外连接outer join, 左边所有信息加上左边和右边的交集部分数据 ```SQL select firstName, lastName, city, state from Person left join Address on Person.personId = Address.PersonId; ``` ### 1581 进店却未进行交易过的顾客 方法一:找到Visits表中visit_id不在Transactions表中的customer_id,说明这些人没有购物过,在进行分组查询计数 ```SQL select customer_id, count(customer_id) count_no_trans from Visits where visit_id not in ( select visit_id from Transactions ) group by customer_id; ``` 方法二:使用外连接查询,Visits左连接Transactions,查询出连接后字段为null的行,就是在Transactions没有交易记录的行,即没有交易的客户,最后进行分组计算 ```SQL select customer_id, count(customer_id) count_no_trans from Visits left join Transactions on Visits.visit_id = Transactions.visit_id where amount is null group by(customer_id); ``` ### 1148 文章浏览Ⅰ 方法一:直接查询+distinct去重 ```SQL select distinct author_id id from Views where author_id = viewer_id order by id; ``` ### 197 上升的温度 DATEDIFF(date1, date2):返回起始时间date1和结束时间date2之间的天数, date1 - date2, 方法一:使用隐式内连接 ```SQL select w1.id from Weather w1, Weather w2 where datediff(w1.recordDate, w2.recordDate) = 1 and w1.Temperature > w2.Temperature; ``` 方法二:使用显式内连接 ```SQL select a.id from weather a join weather b on datediff(a.recordDate, b.recordDate) = 1 where a.Temperature > b.Temperature; ``` ### 607 销售员 方法:先对Company和Orders做内连接,找到买过RED的人员的id,在查询id不等于买过RED人员id的人 ```SQL select s.name from SalesPerson s where s.sales_id not in ( select o.sales_id from Orders o join Company c on o.com_id = c.com_id where c.name = "RED" ); ``` ## 计算函数 ### 1141 查询近30天活跃用户数 注意点,一个用户可能在一天有多次访问,所以count user_id时需要加入distinct 方法一:使用datediff函数,然后分组计数 ```SQL select activity_date day, count(distinct user_id) active_users from Activity where datediff("2019-07-27", activity_date) between 0 and 29 group by activity_date; ``` 方法二:直接使用between ```SQL select activity_date day, count(distinct user_id) active_users from Activity where activity_date between "2019-06-28" and "2019-07-27" group by activity_date; ``` ### 1693 每天的领导和合伙人 方法:分组去重统计 ```SQL select date_id, make_name, count(distinct lead_id) unique_leads, count(distinct partner_id) unique_partners from DailySales group by date_id, make_name; ``` ### 1729 求关注者的数量 方法:分组统计 ```SQL select user_id, count(distinct follower_id) followers_count from Followers group by user_id; ``` ### 586 订单最多的客户 方法一:使用分组将计算订单数,再对订单数进行降序排序,用limit 1取排序后第一个,这样对只有一个最多的答案有效 ```SQL select customer_number from Orders group by customer_number order by count(*) desc limit 1; ``` 方法二:对于多行并列多的数据,可以先查询出最多的订单数,再分组进行匹配进行。 ```SQL select customer_number from Orders group by customer_number having count(*) = ( select count(customer_number) cnt from Orders group by customer_number order by cnt desc limit 1 ); ``` ### 511 游戏玩法分析 Ⅰ 方法:使用聚合函数对分组查询的分组进行操作 ```SQL select player_id, min(event_date) first_login from Activity group by player_id; ``` ### 1890 2020年最后一次登录 方法:找到2020年最晚的一天即可,可以用year函数选出再2020年的行,然后再进行分组+max聚合函数 ```SQL select user_id, max(time_stamp) last_stamp from Logins where year(time_stamp) = "2020" group by user_id; ``` ### 1741 查找每个员工花费的总时间 方法:使用分组查询对id和日期进行分组(即一人一天),使用聚合函数sum计算总时间 ```SQL select event_day "day", emp_id, sum(out_time - in_time) total_time from Employees group by emp_id, event_day ``` ## 控制流 case when的两种语句 ```SQL # 语句一 CASE input_expression WHEN expression1 THEN result_expression1 WHEN expression2 THEN result_expression2 [...n] ELSE result_expression END ``` ```SQL # case语句二 CASE WHEN expression1 THEN result_expression1 WHEN expression2 THEN result_expression2 [...n] ELSE result_expression END ``` * COALESCE是一个函数, (expression_1, expression_2, ...,expression_n)依次参考各参数表达式,遇到非null值即停止并返回该值。如果所有的表达式都是空值,最终将返回一个空值。使用COALESCE在于大部分包含空值的表达式最终将返回空值。 * 外连接时要注意where和on的区别,on是在连接构造临时表时执行的,不管on中条件是否成立都会返回主表(也就是left join左边的表)的内容,where是在临时表形成后执行筛选作用的,不满足条件的整行都会被过滤掉 ### 1393 股票的资本损益 方法:对股票名称进行分组后,使用聚合函数sum,使用if或when case将买的价格设定为负,卖的价格定位正,进行求和 ```SQL select stock_name, sum( case operation when "Buy" then -price else price end ) capital_gain_loss capital_gain_loss from Stocks group by stock_name; ``` ```SQL select stock_name, sum(if(operation="Buy", -price, price)) capital_gain_loss from Stocks group by stock_name; ``` ### 1407 排名靠前的旅行者 方法:利用左连接对两表拼接,然后使用分组+sum聚合函数进行统计,对统计还需要null的判断,可以用ifnull或coalesce函数 注意这里要group by id而不是name,因为一个名字可能不只一个id ```SQL select u.name, ifnull(sum(r.distance),0) travelled_distance from Users u left join Rides r on u.id = r.user_id group by u.id order by travelled_distance desc, name; ``` ### 1158 市场分析 Ⅰ 方法一:使用左外连接进行拼接,在连接时使用on,能查询出为空的字段。然后使用分组加count聚合求出总数 ```SQL select u.user_id buyer_id, u.join_date, count(o.order_id) orders_in_2019 from Users u left join Orders o on u.user_id = o.buyer_id and year(o.order_date) = "2019" group by u.user_id; ``` 方法二:先用分组+count对Orders表查询出购买者id和购买数量,此时是不包含为null的内容,以此作为临时表,和Users表作左连接。 ```SQL select u.user_id buyer_id, u.join_date, ifnull(t.cnt,0) orders_in_2019 from Users u left join ( select buyer_id, count(*) cnt from Orders where year(order_date) = "2019" group by buyer_id ) t on u.user_id = t.buyer_id; ``` ## 过滤 ### 182 查找重复的电子邮箱 方法:使用group by + having子语句,用count聚合函数过滤掉出现次数小于1的 ```SQL select Email from Person group by Email having count(*) >= 2; ``` ### 1050 合作过至少三次的演员和导演 方法:直接group by对两个字段分组,再用having count进行计数过滤 ```SQL select actor_id, director_id from ActorDirector group by actor_id, director_id having count(*) >= 3; ``` ### 1587 银行账户概要 Ⅱ 方法:将用户表和交易表进行左外连接,然后进行分组统计,用sum聚合函数过滤 ```SQL select u.name NAME, sum(t.amount) BALANCE from Users u left join Transactions t on u.account = t.account group by u.account having sum(t.amount) > 10000; ``` ### 1084 销售分析 Ⅲ 方法:先进行左外连接,然后进行分组过滤 注意,having语句不能使用between,应该用聚合函数进行过滤 ```SQL select p.product_id, p.product_name from Product p left join Sales s on p.product_id = s.product_id group by p.product_id having max(s.sale_date) <= "2019-03-31" and min(s.sale_date) >= "2019-01-01"; ```