Hive学习笔记——常用SQL

2016-05-19

tonglin0325

1.查询第二高的值
2.查询连续出现3次的数字
3.查询每个部门下最高的值
4.将分数转换成排名

1.查询第二高的值#

输入：Salary表

+-------------+------+
| Column Name | Type |
+-------------+------+
| id          | int  |
| salary      | int  |
+-------------+------+

使用limit+offset语法来限制结果数量，其中 limit N,1 等于 limit 1 offset N

1 2	select (select DISTINCT Salary from Employee order by Salary DESC limit 1 offset 1) as SecondHighestSalary

输出：

+---------------------+
| SecondHighestSalary |
+---------------------+
| 200                 |
+---------------------+

2.查询连续出现3次的数字#

输入：Logs表

+----+-----+
| id | num |
+----+-----+
| 1  | 1   |
| 2  | 1   |
| 3  | 1   |
| 4  | 2   |
| 5  | 1   |
| 6  | 2   |
| 7  | 2   |
+----+-----+

查询表3次，然后通过where条件来筛选

SELECT DISTINCT(t1.Num) as ConsecutiveNums
FROM Logs t1, Logs t2, Logs t3 
WHERE
    t1.Id = t2.Id - 1
    AND t2.Id = t3.Id - 1
    AND t1.Num = t2.Num
    AND t2.Num = t3.Num
;

输出：

+-----------------+
| ConsecutiveNums |
+-----------------+
| 1               |
+-----------------+

3.查询每个部门下最高的值#

输入：Employee表和Department表

Employee 表:
+----+-------+--------+--------------+
| id | name  | salary | departmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Jim   | 90000  | 1            |
| 3  | Henry | 80000  | 2            |
| 4  | Sam   | 60000  | 2            |
| 5  | Max   | 90000  | 1            |
+----+-------+--------+--------------+
Department 表:
+----+-------+
| id | name  |
+----+-------+
| 1  | IT    |
| 2  | Sales |
+----+-------+

先用group by+max算出每个departmentId下最多的salary，然后用where+in来进行过滤

select t2.Name as Department,t1.Name as Employee,Salary  from (
(select Name, DepartmentId,Salary from Employee) t1
left join 
(select Id,Name from Department) t2
on t1.DepartmentId = t2.Id
)
WHERE
(t2.Id,t1.Salary)
in
(
(select DepartmentId, MAX(Salary) from Employee
group by DepartmentId)
)

4.将分数转换成排名#

输入：Scores表

+----+-------+
| id | score |
+----+-------+
| 1  | 3.50  |
| 2  | 3.65  |
| 3  | 4.00  |
| 4  | 3.85  |
| 5  | 4.00  |
| 6  | 3.65  |
+----+-------+

# t2表中有多少个大于t1当前score的
select Score, 
(select count(distinct(Score)) from Scores t1 where t1.Score > t2.Score ) +1 as `Rank` 
from Scores t2
order by Score DESC;

去重后的scores表中，比scores表的每一行大的有多少个

输出：

+-------+------+
| score | rank |
+-------+------+
| 4.00  | 1    |
| 4.00  | 1    |
| 3.85  | 2    |
| 3.65  | 3    |
| 3.65  | 3    |
| 3.50  | 4    |
+-------+------+

tonglin0325的个人主页