피그마 링크 (암호: aiffel) https://www.figma.com/design/kr6NrmqIG8QTAdVHSYTt6c/Untitled?node-id=0-1&t=iWyFTulKIA4HY72R-1 코랩 링크
공동작업을 위해서는 구글빅쿼리로 작업을 진행하고 데이터마트를 만들어서 서로 공유
장점 : 공동작업과 중간에 업데이트는 쿼리코드만 공유하면 쉽게 데이터나 뷰가 생성된다.
- 주의 : 사전에 이런정보를 공유하지 않으면 개인의 쿼리코드작업후 저장안된건 공유불가능 빅쿼리는 저장안한 코드는 삭제됨.
보안책 : 이럴경우 빅쿼리에서 본인이 업데이트한 뷰나 테이블을 CSV로 저장(빅쿼리 에서 저장할 수 있음) 한 후 공유하면됨
이때 아래 처럼 빅쿼리 공유를 구글이메일로 권한을 줘야 액세스 거부를 사전에 통과할 수 있다.
- 아래 -
Google BigQuery 서비스에서 이 요청에 대한 액세스를 거부했습니다. Access Denied: Table aiffel-454815:ecommerce_mart2.v_event_log: Permission bigquery.tables.get denied on table aiffel-454815:ecommerce_mart2.v_event_log (or it may not exist). Google BigQuery 서버 ''에 연결할 수 없습니다. 서버가 실행 중이며 요청된 데이터베이스에 액세스할 수 있는 권한이 있는지 확인하십시오.
작업시 코드 변경 공유 내용 :
물류팀_A군 : v_logistics_dashboard 컬럼 추가 및 수정 → v_logistics_dashboard_4_view로 수정
- v_logistics_dashboardorder
order_item_id | order_id | product_id | user_id | inventory_created_at | created_at | order_item_shipped_at | order_item_delivered_at | sold_at | order_item_returned_at | distribution_center_id | distribution_center_name | latitude | longitude | city | state | order_item_status | source_type | |
1 | null | null | 13844 | null | 2020-08-16 18:33:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
2 | null | null | 13844 | null | 2024-07-06 01:43:30 UTC | null | null | null | 2024-08-11 11:25:30 UTC | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
3 | null | null | 13844 | null | 2025-01-25 02:17:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
4 | null | null | 13844 | null | 2021-01-13 04:48:29 UTC | null | null | null | 2021-01-14 23:52:29 UTC | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
5 | null | null | 13844 | null | 2025-02-03 15:42:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
6 | null | null | 13844 | null | 2020-06-02 17:40:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
7 | null | null | 13844 | null | 2022-10-30 02:53:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
8 | null | null | 13844 | null | 2022-04-04 09:06:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
9 | null | null | 13844 | null | 2021-06-27 16:24:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
10 | null | null | 13844 | null | 2023-04-15 04:39:30 UTC | null | null | null | 2023-05-23 00:31:30 UTC | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
11 | null | null | 13844 | null | 2024-05-14 16:24:34 UTC | null | null | null | 2024-05-25 12:23:34 UTC | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
12 | null | null | 13844 | null | 2024-06-22 16:05:19 UTC | null | null | null | 2024-06-30 04:37:19 UTC | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
13 | null | null | 13844 | null | 2024-09-12 16:54:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
14 | null | null | 13844 | null | 2024-09-25 09:49:04 UTC | null | null | null | 2024-11-18 07:02:04 UTC | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
15 | null | null | 13844 | null | 2023-06-03 02:41:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
16 | null | null | 13844 | null | 2023-05-06 02:54:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
17 | null | null | 14086 | null | 2023-10-21 11:54:30 UTC | null | null | null | 2023-11-26 01:39:30 UTC | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
18 | null | null | 14086 | null | 2023-11-03 03:09:33 UTC | null | null | null | 2023-12-21 12:23:33 UTC | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
19 | null | null | 14086 | null | 2021-04-25 01:51:00 UTC | null | null | null | null | null | 7 | Philadelphia PA | 39.95 | -75.1667 | null | null | null | inventory |
- v_logistics_dashboard_4_view
- 더보기
행 | order_item_id | order_id | product_id | user_id | created_at | product_distribution_center_id | distribution_center_name | latitude | longitude | city | state | country | name | total_orders | delayed_orders | returned_orders | delay_rate | return_rate | delay_to_return_rate | on_time_rate | spi_score | shipped_at | delivered_at | returned_at | delivery_days | delayed |
1 | 33160 | 22948 | 17788 | 18307 | 2022-08-06 10:54:30 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Zaragoza | Aragón | Spain | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
2 | 5808 | 4006 | 11059 | 3160 | 2025-01-30 09:12:24 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Riverside | California | United States | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
3 | 60229 | 41535 | 4492 | 33264 | 2019-11-24 22:33:23 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Titusville | Florida | United States | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | 2019-11-25 10:09:00 UTC | null | null | null | null |
4 | 83763 | 57727 | 22121 | 46176 | 2023-11-10 06:44:34 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | null | Gyeongsangnam-do | South Korea | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
5 | 133342 | 91994 | 18075 | 73699 | 2022-06-22 10:11:31 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Tongliao | Hebei | China | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | 2022-06-23 18:36:00 UTC | null | null | null | null |
6 | 99745 | 68729 | 6494 | 54881 | 2024-01-20 15:37:09 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Shan Tou | Hubei | China | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
7 | 88444 | 60957 | 27380 | 48791 | 2024-05-11 09:39:26 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Châtellerault | Nouvelle-Aquitaine | France | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
8 | 92899 | 64048 | 23615 | 51226 | 2024-12-21 06:11:46 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Annandale | Virginia | United States | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
9 | 101490 | 69967 | 23584 | 55892 | 2024-11-19 07:29:53 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Maceió | Alagoas | Brasil | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | 2024-11-21 13:31:00 UTC | null | null | null | null |
10 | 55378 | 38231 | 25966 | 30651 | 2023-06-26 09:11:12 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Shanghai | Jiangsu | China | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
11 | 65077 | 44868 | 9089 | 35991 | 2020-08-26 06:05:09 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Sydney | New South Wales | Australia | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
12 | 20002 | 13771 | 27413 | 10954 | 2023-02-15 16:10:49 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Sin-le-Noble | Hauts-de-France | France | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
13 | 91232 | 62916 | 11425 | 50348 | 2025-03-14 16:41:56 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Houston | Texas | United States | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | 2025-03-17 04:40:00 UTC | null | null | null | null |
14 | 165001 | 113969 | 22596 | 91332 | 2025-02-15 11:11:37 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Huaihua | Beijing | China | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
15 | 131786 | 90919 | 22158 | 72843 | 2025-03-23 01:38:30.087368 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Busan | Busan | South Korea | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | 2025-03-21 11:41:18.087368 UTC | null | null | null | null |
16 | 6272 | 4319 | 7944 | 3393 | 2023-11-26 13:04:14 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Devizes | England | United Kingdom | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | 2023-11-26 03:28:00 UTC | null | null | null | null |
17 | 179261 | 123800 | 22718 | 99205 | 2024-12-17 16:27:39 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Sudbury | England | United Kingdom | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | 2024-12-16 15:09:00 UTC | null | null | null | null |
18 | 74278 | 51151 | 2547 | 40958 | 2021-09-02 05:22:23 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Lakeland Highlands | Florida | United States | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | null | null | null | null | null |
19 | 122576 | 84576 | 10622 | 67785 | 2021-12-06 22:44:44 UTC | 4 | Los Angeles CA | 34.05 | -118.25 | Chengdu | Hubei | China | Los Angeles CA | 17223 | 1848 | 1753 | 0.10729838007315799 | 0.10178250014515473 | 0.28625541125541126 | 0.892701619926842 | 0.89490797189804339 | 2021-12-06 04:32:00 UTC | null | null | null | null |
- 컬럼 info
column_name | example_value | inferred_type | |
1 | order_item_id | 33160 | int64 |
2 | order_id | 22948 | int64 |
3 | product_id | 17788 | int64 |
4 | user_id | 18307 | int64 |
5 | created_at | 2022-08-06 10:54:30 UTC | str |
6 | product_distribution_center_id | 4 | int64 |
7 | distribution_center_name | Los Angeles CA | str |
8 | latitude | 34.05 | float64 |
9 | longitude | -118.25 | float64 |
10 | city | Zaragoza | str |
11 | state | Arag처n | str |
12 | country | Spain | str |
13 | name | Los Angeles CA | str |
14 | total_orders | 17223 | int64 |
15 | delayed_orders | 1848 | int64 |
16 | returned_orders | 1753 | int64 |
17 | delay_rate | 0.10729838 | float64 |
18 | return_rate | 0.1017825 | float64 |
19 | delay_to_return_rate | 0.286255411 | float64 |
20 | on_time_rate | 0.89270162 | float64 |
21 | spi_score | 0.894907972 | float64 |
22 | shipped_at | float | |
23 | delivered_at | float | |
24 | returned_at | float | |
25 | delivery_days | float64 | |
26 | delayed | float |
고객팀_B양 : ‘v_user_rfm_segmentation` 뷰 생성 (고객별로 세그멘테이션)
- 뷰 생성 코드
CREATE OR REPLACE VIEW `aiffel-454815.ecommerce_mart2.v_user_rfm_segmentation` AS
WITH base AS (
SELECT
user_id,
MIN(DATE(order_created_at)) AS first_order_date,
MAX(DATE(order_created_at)) AS last_order_date,
COUNT(DISTINCT order_id) AS frequency,
SUM(retail_price * num_of_item) AS total_revenue,
AVG(retail_price * num_of_item) AS aov
FROM `aiffel-454815.ecommerce_mart2.v_marketing_orders`
WHERE status != 'Cancelled'
GROUP BY user_id
),
rfm_step1 AS (
SELECT
*,
DATE_DIFF(DATE('2025-03-27'), last_order_date, DAY) AS recency,
DATE_DIFF(last_order_date, first_order_date, WEEK) + 1 AS active_weeks
FROM base
),
rfm_step2 AS (
SELECT
*,
frequency / NULLIF(active_weeks, 0) AS purchase_freq_per_week,
total_revenue / NULLIF(frequency, 0) AS arpu
FROM rfm_step1
),
rfm_calc AS (
SELECT
*,
aov * purchase_freq_per_week * active_weeks AS raw_ltv,
DATE_DIFF(DATE('2025-03-27'), last_order_date, DAY) >= 60 AS is_churned,
CASE
WHEN DATE_DIFF(DATE('2025-03-27'), last_order_date, DAY) >= 60 THEN
aov * purchase_freq_per_week * active_weeks * 0.5
ELSE
aov * purchase_freq_per_week * active_weeks
END AS ltv,
NTILE(5) OVER (ORDER BY recency ASC) AS r_score,
NTILE(5) OVER (ORDER BY frequency ASC) AS f_score,
NTILE(5) OVER (ORDER BY total_revenue ASC) AS m_score
FROM rfm_step2
),
clustered AS (
SELECT *,
r_score + f_score + m_score AS rfm_score,
CASE
WHEN r_score >= 3 AND f_score >= 3 AND m_score >= 3 AND (r_score + f_score + m_score) >= 10 THEN 'VIP'
WHEN f_score >= 4 THEN '충성고객'
WHEN r_score <= 2 AND f_score <= 2 AND m_score <= 2 THEN '이탈위험고객'
ELSE '일반고객'
END AS cluster
FROM rfm_calc
)
SELECT
user_id,
recency,
frequency,
total_revenue,
aov,
arpu,
purchase_freq_per_week,
active_weeks,
raw_ltv,
is_churned,
ltv,
r_score,
f_score,
m_score,
rfm_score,
cluster
FROM clustered
- (0328) 마트뷰 변경
- 고객팀_세그먼트 View
CREATE OR REPLACE VIEW ecommerce_mart2.v_user_rfm_segmentation AS
-- [1] 유저별 기본 구매 정보 집계 (취소 제외)
WITH base AS (
SELECT
user_id,
MIN(DATE(order_created_at)) AS first_order_date,
MAX(DATE(order_created_at)) AS last_order_date,
COUNT(DISTINCT order_id) AS frequency,
SUM(sale_price * num_of_item) AS total_revenue,
AVG(sale_price * num_of_item) AS aov
FROM ecommerce_mart2.v_marketing_orders
WHERE status != 'Cancelled'
GROUP BY user_id
),
-- [2] Recency, Active Weeks 계산
rfm_step1 AS (
SELECT
*,
DATE_DIFF(DATE('2025-03-27'), last_order_date, DAY) AS recency,
DATE_DIFF(last_order_date, first_order_date, WEEK) + 1 AS active_weeks
FROM base
),
-- [3] 주당 구매 빈도, ARPU, LTV 계산
rfm_step2 AS (
SELECT
*,
frequency / NULLIF(active_weeks, 0) AS purchase_freq_per_week,
total_revenue / NULLIF(frequency, 0) AS arpu
FROM rfm_step1
),
-- [4] LTV 계산 + 이탈 여부 반영 (60일 이상 미접속 시 LTV 50% 감소)
rfm_ltv AS (
SELECT
*,
aov * purchase_freq_per_week * active_weeks AS raw_ltv,
DATE_DIFF(DATE('2025-03-27'), last_order_date, DAY) >= 60 AS is_churned,
CASE
WHEN DATE_DIFF(DATE('2025-03-27'), last_order_date, DAY) >= 60 THEN
aov * purchase_freq_per_week * active_weeks * 0.5
ELSE
aov * purchase_freq_per_week * active_weeks
END AS ltv
FROM rfm_step2
),
-- [5] RFM 점수 부여 (NTILE 사용 시 user_id 중복 방지를 위해 여기서 별도 처리)
rfm_scored AS (
SELECT
*,
NTILE(5) OVER (ORDER BY recency ASC) AS r_score,
NTILE(5) OVER (ORDER BY frequency ASC) AS f_score,
NTILE(5) OVER (ORDER BY total_revenue ASC) AS m_score
FROM rfm_ltv
),
-- [6] 클러스터 정의
clustered AS (
SELECT *,
r_score + f_score + m_score AS rfm_score,
CASE
WHEN r_score >= 3 AND f_score >= 3 AND m_score >= 3 AND (r_score + f_score + m_score) >= 10 THEN 'VIP'
WHEN f_score >= 4 THEN '충성고객'
WHEN r_score <= 2 AND f_score <= 2 AND m_score <= 2 THEN '이탈위험고객'
ELSE '일반고객'
END AS cluster
FROM rfm_scored
)
-- [7] 최종 출력
SELECT
user_id,
first_order_date,
last_order_date,
recency,
frequency AS orders,
total_revenue,
aov,
arpu,
purchase_freq_per_week,
active_weeks,
raw_ltv,
is_churned,
ltv,
r_score,
f_score,
m_score,
rfm_score,
cluster
FROM clustered;
+ f_score + m_score) >= 10 THEN 'VIP' WHEN f_score >= 4 THEN '충성고객' WHEN r_score <= 2 AND f_score <= 2 AND m_score <= 2 THEN '이탈위험고객' ELSE '일반고객' END AS cluster FROM rfm_scored ) -- [7] 최종 출력 SELECT user_id, first_order_date, last_order_date, recency, frequency AS orders, total_revenue, aov, arpu, purchase_freq_per_week, active_weeks, raw_ltv, is_churned, ltv, r_score, f_score, m_score, rfm_score, cluster FROM clustered;
마트 뷰 수정코드
-- 1. 전사 공통 데이터 마트 (Fact + Dim 테이블 구성)
-- 가정: 데이터셋들은 BigQuery에 업로드되어 있고, dataset 이름은 `ecommerce_raw` 라고 가정
-- Fact Table: fact_order_item
CREATE OR REPLACE TABLE ecommerce_mart2.fact_order_item AS
SELECT
oi.id AS order_item_id,
oi.product_id,
oi.order_id,
o.user_id,
-- p.cost,
-- p.category,
-- p.name AS product_name,
-- p.brand,
-- p.retail_price,
-- p.department,
-- p.distribution_center_id,
o.created_at AS order_created_at,
oi.created_at AS order_item_created_at,
oi.status,
oi.shipped_at,
oi.sale_price,
oi.delivered_at AS order_item_delivered_at,
o.delivered_at AS order_deliverd_at,
oi.returned_at,
o.num_of_item
FROM ecommerce_raw.order_items oi
JOIN ecommerce_raw.orders o ON oi.order_id = o.order_id;
-- JOIN ecommerce_raw.product p ON oi.product_id = p.id;
--Fact Table: Inventory_item (재고 분석용)
CREATE OR REPLACE TABLE ecommerce_mart2.fact_inventory_item AS
SELECT
ii.id AS inventory_item_id,
ii.product_id,
ii.created_at AS inventory_created_at,
ii.sold_at,
ii.cost,
ii.product_category AS category,
ii.product_name,
ii.product_brand AS brand,
ii.product_retail_price AS retail_price,
ii.product_department AS department,
ii.product_sku AS sku,
ii.product_distribution_center_id AS distribution_center_id
FROM ecommerce_raw.inventory_items ii;
-- Fact Table: fact_event_log (마케팅팀 세션 분석용)
CREATE OR REPLACE TABLE ecommerce_mart2.fact_event_log AS
SELECT
id AS event_id,
user_id,
session_id,
created_at AS event_created_at,
event_type,
traffic_source,
browser,
city,
state,
postal_code,
uri,
sequence_number
FROM ecommerce_raw.events;
-- Dim Table: dim_user
CREATE OR REPLACE TABLE ecommerce_mart2.dim_user AS
SELECT
id AS user_id,
first_name,
last_name,
email,
age,
gender,
city,
state,
postal_code,
country,
traffic_source,
created_at AS user_created_at
FROM ecommerce_raw.users;
-- Dim Table: dim_product
CREATE OR REPLACE TABLE ecommerce_mart2.dim_product AS
SELECT
id AS product_id,
name,
category,
brand,
department,
sku,
retail_price,
distribution_center_id
FROM ecommerce_raw.products;
-- Dim Table: dim_distribution_center
CREATE OR REPLACE TABLE ecommerce_mart2.dim_distribution_center AS
SELECT
id AS distribution_center_id,
name,
latitude,
longitude
FROM ecommerce_raw.distribution_centers;
-- 2. 팀별 마트용 View 생성 (검사)
-- 운영팀 View
CREATE OR REPLACE VIEW ecommerce_mart2.v_operations_dashboard AS
SELECT
f.*,
u.city,
u.state,
p.category,
p.brand,
p.name AS product_name,
p.retail_price AS product_retail_price,
p.distribution_center_id
FROM ecommerce_mart2.fact_order_item f
LEFT JOIN ecommerce_mart2.dim_user u ON f.user_id = u.user_id
LEFT JOIN ecommerce_mart2.dim_product p ON f.product_id = p.product_id;
-- 마케팅팀 View 1: 주문/고객 중심
CREATE OR REPLACE VIEW ecommerce_mart2.v_marketing_orders AS
SELECT
f.user_id,
f.order_id,
f.order_created_at,
u.traffic_source,
u.gender,
u.age,
u.city,
u.state,
f.status,
f.product_id,
f.returned_at,
f.num_of_item,
p.retail_price,
f.sale_price
FROM `ecommerce_mart.dim_product` p
RIGHT JOIN ecommerce_mart2.fact_order_item f ON p.product_id = f.product_id
LEFT JOIN ecommerce_mart2.dim_user u ON f.user_id = u.user_id
;
-- 마케팅팀 View 2: 세션/이벤트 로그 중심
CREATE OR REPLACE VIEW ecommerce_mart2.v_event_log AS
SELECT
event_id,
user_id,
session_id,
event_created_at,
event_type,
traffic_source,
browser,
city,
state,
postal_code,
uri,
sequence_number
FROM ecommerce_mart2.fact_event_log;
-- 판매팀 View
CREATE OR REPLACE VIEW ecommerce_mart2.v_sales_dashboard AS
SELECT
f.product_id,
p.name AS product_name,
p.category,
p.brand,
p.retail_price,
f.status,
f.returned_at,
f.order_item_created_at,
f.order_item_delivered_at,
f.num_of_item,
f.shipped_at AS order_item_shiped_at,
f.sale_price
FROM ecommerce_mart2.fact_order_item f
LEFT JOIN ecommerce_mart2.dim_product p ON f.product_id = p.product_id;
-- 물류팀 View
CREATE OR REPLACE VIEW ecommerce_mart2.v_logistics_dashboard AS
-- 주문 기반 데이터
SELECT
f.order_item_id,
f.order_id,
f.product_id,
f.user_id,
f.order_item_created_at AS created_at,
f.shipped_at,
f.order_item_delivered_at AS delivered_at,
f.returned_at,
dp.distribution_center_id,
dc.name AS distribution_center_name,
dc.latitude,
dc.longitude,
u.city,
u.state,
f.status,
'order' AS source_type -- 주문 데이터라는 표시
FROM ecommerce_mart2.fact_order_item f
LEFT JOIN ecommerce_mart2.dim_user u ON f.user_id = u.user_id
LEFT JOIN ecommerce_mart2.dim_product dp ON f.product_id = dp.product_id
LEFT JOIN ecommerce_mart2.dim_distribution_center dc ON dp.distribution_center_id = dc.distribution_center_id
UNION ALL
-- 재고 기반 데이터
SELECT
NULL AS order_item_id,
NULL AS order_id,
i.product_id,
NULL AS user_id,
i.inventory_created_at AS created_at,
NULL AS shipped_at,
i.sold_at AS delivered_at, -- 실제 판매 시점이 배송 완료 시점과 유사
NULL AS returned_at,
i.distribution_center_id,
dc.name AS distribution_center_name,
dc.latitude,
dc.longitude,
NULL AS city,
NULL AS state,
NULL AS status,
'inventory' AS source_type -- 재고 데이터라는 표시
FROM ecommerce_mart2.fact_inventory_item i
LEFT JOIN ecommerce_mart2.dim_distribution_center dc ON i.distribution_center_id = dc.distribution_center_id;
-- 고객팀_세그먼트 View
CREATE OR REPLACE VIEW ecommerce_mart2.v_user_rfm_segmentation AS
-- [1] 유저별 기본 구매 정보 집계 (취소 제외)
WITH base AS (
SELECT
user_id,
MIN(DATE(order_created_at)) AS first_order_date,
MAX(DATE(order_created_at)) AS last_order_date,
COUNT(DISTINCT order_id) AS frequency,
SUM(sale_price * num_of_item) AS total_revenue,
AVG(sale_price * num_of_item) AS aov
FROM `aiffel-454815.ecommerce_mart2.v_marketing_orders`
WHERE status != 'Cancelled'
GROUP BY user_id
),
-- [2] Recency, Active Weeks 계산
rfm_step1 AS (
SELECT
*,
DATE_DIFF(DATE('2025-03-27'), last_order_date, DAY) AS recency,
DATE_DIFF(last_order_date, first_order_date, WEEK) + 1 AS active_weeks
FROM base
),
-- [3] 주당 구매 빈도, ARPU, LTV 계산
rfm_step2 AS (
SELECT
*,
frequency / NULLIF(active_weeks, 0) AS purchase_freq_per_week,
total_revenue / NULLIF(frequency, 0) AS arpu
FROM rfm_step1
),
-- [4] LTV 계산 + 이탈 여부 반영 (60일 이상 미접속 시 LTV 50% 감소)
rfm_ltv AS (
SELECT
*,
aov * purchase_freq_per_week * active_weeks AS raw_ltv,
DATE_DIFF(DATE('2025-03-27'), last_order_date, DAY) >= 60 AS is_churned,
CASE
WHEN DATE_DIFF(DATE('2025-03-27'), last_order_date, DAY) >= 60 THEN
aov * purchase_freq_per_week * active_weeks * 0.5
ELSE
aov * purchase_freq_per_week * active_weeks
END AS ltv
FROM rfm_step2
),
-- [5] RFM 점수 부여 (NTILE 사용 시 user_id 중복 방지를 위해 여기서 별도 처리)
rfm_scored AS (
SELECT
*,
NTILE(5) OVER (ORDER BY recency ASC) AS r_score,
NTILE(5) OVER (ORDER BY frequency ASC) AS f_score,
NTILE(5) OVER (ORDER BY total_revenue ASC) AS m_score
FROM rfm_ltv
),
-- [6] 클러스터 정의
clustered AS (
SELECT *,
r_score + f_score + m_score AS rfm_score,
CASE
WHEN r_score >= 3 AND f_score >= 3 AND m_score >= 3 AND (r_score + f_score + m_score) >= 10 THEN 'VIP'
WHEN f_score >= 4 THEN '충성고객'
WHEN r_score <= 2 AND f_score <= 2 AND m_score <= 2 THEN '이탈위험고객'
ELSE '일반고객'
END AS cluster
FROM rfm_scored
)
-- [7] 최종 출력
SELECT
user_id,
last_order_date,
recency,
frequency AS orders,
total_revenue,
aov,
arpu,
purchase_freq_per_week,
active_weeks,
raw_ltv,
is_churned,
ltv,
r_score,
f_score,
m_score,
rfm_score,
cluster
FROM clustered;
세일즈&환불팀_C군
- 세일즈 뷰 변경
-
- 판매팀 View CREATE OR REPLACE VIEW ecommerce_mart2.v_sales_dashboard AS SELECT f.product_id, p.name AS product_name, p.category, f.order_id, f.order_created_at, p.brand, u.city, u.state, ii.cost, u.traffic_source, u.age, p.retail_price, f.status, f.returned_at, f.order_item_created_at AS sales_date, f.order_item_delivered_at, f.num_of_item, f.shipped_at AS order_item_shiped_at, f.sale_price FROM ecommerce_mart2.fact_order_item f LEFT JOIN ecommerce_mart2.dim_product p ON f.product_id = p.product_id LEFT JOIN ecommerce_mart2.dim_user u ON f.user_id = u.user_id LEFT JOIN ecommerce_mart2.fact_inventory_item ii ON f.product_id = ii.product_id;
-
초기 검토 데이터후보
Airbnb Open Data (이탈리아 피렌체) 2안 선택안함
태블로에서 1순위기준 잼있는거 2순위 인터랙티브하고 동적인 대시보드 구현에 최적화된 데이터 목적 : 실제로 여행을 간다는가정하에 필요한 항목이나 정보를 시각화(태블로 시각화) Airbnb Open Data (이탈리아 피렌체) 대시보드 구조
시계열, 지리적, 가격 기반 분석을 포함한 인터랙티브한 대시보드 구현이 가능. 단, 속도와 데이터 정제 측면에서 사전 전처리와 설계가 중요
1. 장점 (Tableau 대시보드 최적화 관점)
데이터 연계성 높음
listings.csv: 숙소 고유 정보
calendar.csv.gz: 날짜별 예약 가능성과 가격
reviews.csv.gz: 리뷰 내용 및 날짜
neighbourhoods.csv: 지리적 구분 → 이들은 listing_id나 neighbourhood 기준으로 서로 연계할 수 있어 분석 단위 통합이 가능함
시계열 분석 가능
calendar.csv: 일자별 가격, 예약 가능 여부 제공 → 가격 추이, 시즌별 분석 가능
reviews.csv: 리뷰 날짜 기반 → 사용자 경험 변화 분석
지리적 시각화 가능
neighbourhoods.csv 및 .geojson → 지도 시각화 지원
listings.csv에는 위도/경도 포함 → Tableau Map View 완벽 지원
사용자 분석이 가능
리뷰 텍스트, 리뷰 개수, 평점 등 고객 반응 기반의 분석 구현 가능
슈퍼호스트, 응답속도 등 호스트 특성도 분석 가능
2. 단점 및 주의할 점
데이터 양이 많아 Tableau 속도 저하 가능성
특히 calendar.csv는 listing × 날짜 수 만큼 레코드 존재 → 데이터 추출(extract) 방식 추천
리뷰 내용은 자연어로 처리 어려움
리뷰 텍스트는 Tableau에서 직접 분석 어려움 → 사전 텍스트 전처리 필요
결측치 및 형식 불일치 가능성
일부 열에 결측값 많을 수 있음 (ex: 호스트 응답률, 편의시설)
날짜 포맷, 숫자 포맷이 locale에 따라 깨질 수 있음 → 사전 정제 필요
시간 범위 통일 필요
각 데이터셋의 기준 날짜가 상이할 수 있음 → 연도/월 필터는 사전 정의 필요
3. 추가 추천 파생변수 (계산 필드 또는 파생 필드)
Tableau에서 인터랙티브 분석 강화를 위해 파생변수 추가추천
price_cleaned (문자열에서 숫자 추출: $120 → 120)
is_available (calendar 데이터의 available → True/False 처리)
review_per_month (리뷰 수 / 운영 개월 수)
weekday (calendar의 날짜에서 요일 추출)
season (월에 따라 성수기/비수기 분류)
stay_category (숙박 일 수에 따라 단기/중기/장기 분류)
Airbnb Open Data 대시보드 구조
목적: 숙소 유형별 트렌드 분석 및 사용자 맞춤 숙소 필터링
주요 시각화 요소
- 숙소 개수 및 평균 가격 KPI
- 숙소 위치 맵 (지도 시각화)
- 가격 범위에 따라 색상 또는 점 크기 조절
- 숙소 유형별 가격 분포 (박스플롯 또는 막대그래프)
- Entire home, Private room 등
- 월별 리뷰 수 추이 (시계열)
- 호스트 상위 Top 10 및 숙소 수 (가로 막대)
인터랙티브 기능
- 슬라이서: 가격 범위, 숙소 유형, 지역 선택
- 지도 클릭 시 해당 위치의 상세 정보 필터링
- 날짜 필터: 연도/월별 선택
- 리뷰 수 기준 정렬 기능
- https://insideairbnb.com/florence/
- ▶ 이유 : 가격데이터에 대한 세분화가 안되어있어 수익, 지출, 등 회사입장에서 인사이트 도출에 한계가 있음
이렇게 빅쿼리에서 서로 데이터에 대한 기본 데이터는 아래로 정함.
Looker Ecommerce BigQuery Dataset 사
https://www.kaggle.com/datasets/mustafakeser4/looker-ecommerce-bigquery-dataset
Looker Ecommerce BigQuery Dataset
CSV version of BigQuery Looker Ecommerce Dataset
www.kaggle.com
프로세스
데이터 선정 → EDA → 데이터 마트 구축 → 대시보드 설계 → 대시보드 구축
'데이터 분석가:Applied Data Analytics' 카테고리의 다른 글
Main Quest06_Project_Tableau 태블로 시각화 6-3 (0) | 2025.03.30 |
---|---|
Main Quest06_Project_Tableau 태블로 시각화 6-2 (0) | 2025.03.30 |
Power BI 란? (0) | 2025.03.25 |
태블로 시각화 프로젝트 예시 (0) | 2025.03.25 |
데이터 시각화-Tableau(태블로) 대시보드 (0) | 2025.03.25 |