유니온 파인드 (Union-Find)

개념

유니온 파인드 (Disjoint Set) - 서로소 집합을 관리하는 자료구조

**합집합(Union)**과 찾기(Find) 연산을 효율적으로 처리

시간복잡도

연산	시간복잡도	최적화 전
Find	`O(α(n))`	`O(n)`
Union	`O(α(n))`	`O(n)`

α(n): 아커만 함수의 역함수 (거의 상수, 매우 작음)

특징

주요 연산

Find(x): x가 속한 집합의 대표 찾기
Union(x, y): x와 y가 속한 집합을 합치기
Connected(x, y): x와 y가 같은 집합인지 확인

장점

매우 빠른 연산: 거의 O(1)
간단한 구현: 배열 하나로 가능
공간 효율적: O(n)

단점

집합 분리 불가: 한번 합친 집합은 분리 불가
순서 정보 없음: 집합 내 원소 순서 관리 X

Python 구현

1. 기본 구현

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
 
    def find(self, x):
        if self.parent[x] != x:
            return self.find(self.parent[x])
        return x
 
    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
 
        if root_x != root_y:
            self.parent[root_y] = root_x
 
    def connected(self, x, y):
        return self.find(x) == self.find(y)
 
# 테스트
uf = UnionFind(5)
uf.union(0, 1)
uf.union(1, 2)
print(uf.connected(0, 2))  # True
print(uf.connected(0, 3))  # False
 
# 시간복잡도: O(n) - 최악의 경우

2. 경로 압축 (Path Compression)

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
 
    def find(self, x):
        # 경로 압축: 루트를 직접 부모로 설정
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
 
    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
 
        if root_x != root_y:
            self.parent[root_y] = root_x
 
    def connected(self, x, y):
        return self.find(x) == self.find(y)
 
# 테스트
uf = UnionFind(5)
uf.union(0, 1)
uf.union(1, 2)
uf.union(2, 3)
print(uf.connected(0, 3))  # True
 
# 시간복잡도: O(log n) - 경로 압축으로 개선

3. Union by Rank (랭크 기반 합치기)

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
 
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
 
    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
 
        if root_x != root_y:
            # 랭크가 작은 트리를 큰 트리에 합침
            if self.rank[root_x] < self.rank[root_y]:
                self.parent[root_x] = root_y
            elif self.rank[root_x] > self.rank[root_y]:
                self.parent[root_y] = root_x
            else:
                self.parent[root_y] = root_x
                self.rank[root_x] += 1
 
    def connected(self, x, y):
        return self.find(x) == self.find(y)
 
# 시간복잡도: O(α(n)) - 거의 상수

4. 집합 크기 추적

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
 
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
 
    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
 
        if root_x != root_y:
            # 작은 집합을 큰 집합에 합침
            if self.size[root_x] < self.size[root_y]:
                self.parent[root_x] = root_y
                self.size[root_y] += self.size[root_x]
            else:
                self.parent[root_y] = root_x
                self.size[root_x] += self.size[root_y]
 
    def get_size(self, x):
        return self.size[self.find(x)]
 
    def connected(self, x, y):
        return self.find(x) == self.find(y)
 
# 테스트
uf = UnionFind(5)
uf.union(0, 1)
uf.union(1, 2)
print(uf.get_size(0))  # 3
print(uf.get_size(3))  # 1

자주 나오는 문제 유형

1. 연결 요소 개수 세기

문제: 그래프의 연결 요소 개수 구하기

def count_components(n, edges):
    uf = UnionFind(n)
 
    for u, v in edges:
        uf.union(u, v)
 
    # 대표 노드 개수 세기
    return len(set(uf.find(i) for i in range(n)))
 
# 테스트
n = 5
edges = [(0, 1), (1, 2), (3, 4)]
print(count_components(n, edges))  # 2
 
# 시간복잡도: O(E × α(n)) - E: 간선 개수

2. 사이클 감지

문제: 무방향 그래프에서 사이클이 있는지 확인

def has_cycle(n, edges):
    uf = UnionFind(n)
 
    for u, v in edges:
        # 이미 같은 집합이면 사이클
        if uf.connected(u, v):
            return True
        uf.union(u, v)
 
    return False
 
# 테스트
n = 4
edges = [(0, 1), (1, 2), (2, 0)]
print(has_cycle(n, edges))  # True
 
edges = [(0, 1), (1, 2), (2, 3)]
print(has_cycle(n, edges))  # False
 
# 시간복잡도: O(E × α(n))

3. 크루스칼 알고리즘 (MST)

문제: 최소 신장 트리 찾기

def kruskal_mst(n, edges):
    """
    edges: [(weight, u, v), ...]
    """
    # 가중치 기준 정렬
    edges.sort()
 
    uf = UnionFind(n)
    mst = []
    total_weight = 0
 
    for weight, u, v in edges:
        # 사이클이 생기지 않으면 추가
        if not uf.connected(u, v):
            uf.union(u, v)
            mst.append((u, v, weight))
            total_weight += weight
 
    return mst, total_weight
 
# 테스트
n = 4
edges = [
    (1, 0, 1),
    (2, 1, 2),
    (3, 2, 3),
    (4, 3, 0),
    (5, 0, 2)
]
mst, weight = kruskal_mst(n, edges)
print(f"MST: {mst}, Total Weight: {weight}")
# MST: [(0, 1, 1), (1, 2, 2), (2, 3, 3)], Total Weight: 6
 
# 시간복잡도: O(E log E)

4. 친구 네트워크

문제: 친구 관계를 추가하며 각 네트워크 크기 출력

def friend_network(n, relationships):
    uf = UnionFind(n)
    result = []
 
    for u, v in relationships:
        uf.union(u, v)
        # 합친 후 네트워크 크기
        result.append(uf.get_size(u))
 
    return result
 
# 테스트
n = 6
relationships = [(0, 1), (1, 2), (3, 4), (0, 3)]
print(friend_network(n, relationships))
# [2, 3, 2, 5]
 
# 시간복잡도: O(n × α(n))

5. 방 번호

문제: 빈 방 찾기 (Union-Find 응용)

class RoomAllocator:
    def __init__(self, n):
        # parent[i]: i번 방이 꽉 차면 다음 빈 방 번호
        self.parent = list(range(n + 1))
 
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
 
    def allocate(self, room):
        # 빈 방 찾기
        empty_room = self.find(room)
        # 다음 방으로 포인터 이동
        self.parent[empty_room] = empty_room + 1
        return empty_room
 
# 테스트
allocator = RoomAllocator(10)
print(allocator.allocate(3))  # 3
print(allocator.allocate(3))  # 4
print(allocator.allocate(3))  # 5
print(allocator.allocate(1))  # 1
print(allocator.allocate(1))  # 2
 
# 시간복잡도: O(α(n))

6. 계정 병합

문제: 이메일 주소로 같은 사람의 계정 병합

def accounts_merge(accounts):
    """
    accounts: [["John", "a@mail.com", "b@mail.com"], ...]
    """
    email_to_id = {}
    email_to_name = {}
    id_counter = 0
 
    # 이메일에 고유 ID 할당
    for account in accounts:
        name = account[0]
        for email in account[1:]:
            if email not in email_to_id:
                email_to_id[email] = id_counter
                email_to_name[email] = name
                id_counter += 1
 
    # Union-Find 초기화
    uf = UnionFind(id_counter)
 
    # 같은 계정의 이메일 합치기
    for account in accounts:
        first_email_id = email_to_id[account[1]]
        for email in account[2:]:
            uf.union(first_email_id, email_to_id[email])
 
    # 결과 그룹화
    groups = {}
    for email, email_id in email_to_id.items():
        root = uf.find(email_id)
        if root not in groups:
            groups[root] = []
        groups[root].append(email)
 
    # 결과 포맷팅
    result = []
    for group in groups.values():
        name = email_to_name[group[0]]
        result.append([name] + sorted(group))
 
    return result
 
# 테스트
accounts = [
    ["John", "a@mail.com", "b@mail.com"],
    ["John", "c@mail.com"],
    ["John", "a@mail.com", "c@mail.com"]
]
print(accounts_merge(accounts))
# [['John', 'a@mail.com', 'b@mail.com', 'c@mail.com']]
 
# 시간복잡도: O(N × α(N)) - N: 총 이메일 개수

핵심 요약

개념	설명
유니온 파인드	서로소 집합 자료구조
주요 연산	Find, Union, Connected
최적화	경로 압축 + Union by Rank
시간복잡도	O(α(n)) - 거의 상수

최적화 기법

1. 경로 압축 (Path Compression)

def find(self, x):
    if self.parent[x] != x:
        # 재귀 중 모든 노드의 부모를 루트로 업데이트
        self.parent[x] = self.find(self.parent[x])
    return self.parent[x]

효과: 트리 높이 감소 → Find 연산 최적화

2. Union by Rank

def union(self, x, y):
    root_x = self.find(x)
    root_y = self.find(y)
 
    if root_x != root_y:
        if self.rank[root_x] < self.rank[root_y]:
            self.parent[root_x] = root_y
        elif self.rank[root_x] > self.rank[root_y]:
            self.parent[root_y] = root_x
        else:
            self.parent[root_y] = root_x
            self.rank[root_x] += 1

효과: 항상 낮은 트리를 높은 트리에 합침 → 높이 최소화

3. Union by Size

def union(self, x, y):
    root_x = self.find(x)
    root_y = self.find(y)
 
    if root_x != root_y:
        if self.size[root_x] < self.size[root_y]:
            self.parent[root_x] = root_y
            self.size[root_y] += self.size[root_x]
        else:
            self.parent[root_y] = root_x
            self.size[root_x] += self.size[root_y]

효과: 작은 집합을 큰 집합에 합침 → 균형 유지

활용 분야

1. 그래프 알고리즘

연결 요소 찾기
사이클 감지
최소 신장 트리 (크루스칼)

2. 네트워크 문제

친구 관계
계정 병합
소셜 네트워크 분석

3. 이미지 처리

연결된 영역 찾기
이미지 세그멘테이션

4. 게임 개발

영역 연결
팀 그룹화

주의사항

1. 초기화

# n개의 원소를 각각 독립된 집합으로 초기화
self.parent = list(range(n))

2. Find 최적화 필수

# 경로 압축 없으면 O(n) 시간복잡도
# 반드시 경로 압축 사용
if self.parent[x] != x:
    self.parent[x] = self.find(self.parent[x])

3. Union 전 Find

# Union 시 반드시 Find로 루트 찾기
root_x = self.find(x)
root_y = self.find(y)

4. 집합 분리 불가

# 한번 합친 집합은 분리 불가
# 필요하면 새로 초기화

유니온 파인드 템플릿

기본 템플릿

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
 
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
 
    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
 
        if root_x != root_y:
            if self.rank[root_x] < self.rank[root_y]:
                self.parent[root_x] = root_y
            elif self.rank[root_x] > self.rank[root_y]:
                self.parent[root_y] = root_x
            else:
                self.parent[root_y] = root_x
                self.rank[root_x] += 1
 
    def connected(self, x, y):
        return self.find(x) == self.find(y)

크기 추적 템플릿

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
 
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
 
    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
 
        if root_x != root_y:
            if self.size[root_x] < self.size[root_y]:
                self.parent[root_x] = root_y
                self.size[root_y] += self.size[root_x]
            else:
                self.parent[root_y] = root_x
                self.size[root_x] += self.size[root_y]
 
    def get_size(self, x):
        return self.size[self.find(x)]
 
    def connected(self, x, y):
        return self.find(x) == self.find(y)

시간복잡도 분석

최적화	Find	Union	비고
없음	`O(n)`	`O(n)`	편향 트리
경로 압축	`O(log n)`	`O(log n)`	평균적으로 개선
Union by Rank	`O(log n)`	`O(log n)`	균형 유지
둘 다	`O(α(n))`	`O(α(n))`	최적

α(n): 역 아커만 함수 (실질적으로 4 이하)

응용 문제 패턴

1. 연결 요소 개수

uf = UnionFind(n)
for u, v in edges:
    uf.union(u, v)
components = len(set(uf.find(i) for i in range(n)))

2. 사이클 감지

uf = UnionFind(n)
for u, v in edges:
    if uf.connected(u, v):
        return True  # 사이클 발견
    uf.union(u, v)
return False

3. 동적 연결성

uf = UnionFind(n)
for query in queries:
    if query[0] == 'union':
        uf.union(query[1], query[2])
    else:  # connected
        print(uf.connected(query[1], query[2]))

기쁘달

탐색기

유니온 파인드 (Union-Find)

개념

시간복잡도

특징

주요 연산

장점

단점

Python 구현

1. 기본 구현

2. 경로 압축 (Path Compression)

3. Union by Rank (랭크 기반 합치기)

4. 집합 크기 추적

자주 나오는 문제 유형

1. 연결 요소 개수 세기

2. 사이클 감지

3. 크루스칼 알고리즘 (MST)

4. 친구 네트워크

5. 방 번호

6. 계정 병합

추천 연습 문제

기초

중급

고급

핵심 요약

최적화 기법

1. 경로 압축 (Path Compression)

2. Union by Rank

3. Union by Size

활용 분야

1. 그래프 알고리즘

2. 네트워크 문제

3. 이미지 처리

4. 게임 개발

주의사항

1. 초기화

2. Find 최적화 필수

3. Union 전 Find

4. 집합 분리 불가

유니온 파인드 템플릿

기본 템플릿

크기 추적 템플릿

시간복잡도 분석

응용 문제 패턴

1. 연결 요소 개수

2. 사이클 감지

3. 동적 연결성

그래프 뷰

목차

백링크