# hive-stat **Repository Path**: zhouxingjia/hive-stat ## Basic Information - **Project Name**: hive-stat - **Description**: hive统计sql练习 - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 1 - **Forks**: 0 - **Created**: 2022-04-02 - **Last Updated**: 2022-09-13 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README #题目一(简单) #展示电影 ID 为 2116 这部电影各年龄段的平均影评分。 select u.age AS age, avg(r.rate) AS avgrate from t_rating AS r inner join t_user AS u on (r.userid = u.userid) where r.movieid = 2116 group by u.age order by u.age; ![img_6.png](img_6.png) #题目二(中等) #找出男性评分最高且评分次数超过 50 次的 10 部电影,展示电影名,平均影评分和评分次数。 select collect_set(u.sex)[0] AS sex, collect_set(m.moviename)[0] AS name, avg(r.rate) AS avgrate, count(1) AS total from t_rating AS r inner join t_user AS u on (r.userid = u.userid) inner join t_movie AS m on (r.movieid = m.movieid) where u.sex = 'M' group by r.movieid having count(1) > 50 order by avg(r.rate) desc limit 10; ![img_5.png](img_5.png) #题目三(选做) #找出影评次数最多的女士所给出最高分的 10 部电影的平均影评分,展示电影名和平均影评分(可使用多行 SQL)。 select collect_set(um.moviename)[0], avg(ur.rate) from t_rating AS ur inner join t_movie AS um on (ur.movieid = um.movieid) where ur.movieid IN ( select movieid from t_rating AS r where r.userid = ( select userid from t_rating where userid IN ( select userid from t_user where sex = 'F' ) group by userid order by count(1) desc limit 1 ) order by r.rate desc limit 10 ) group by ur.movieid; ![img_4.png](img_4.png) #数据解释 t_user 观众表(6000+ 条数据) 字段为:UserID, Sex, Age, Occupation, Zipcode 字段中文解释:用户 id,性别,年龄,职业,邮编 t_movie 电影表(共 3000+ 条数据) 字段为:MovieID, MovieName, MovieType 字段中文解释:电影 ID,电影名,电影类型 t_rating 影评表(100 万 + 条数据) 字段为:UserID, MovieID, Rate, Times 字段中文解释:用户 ID,电影 ID,评分,评分时间