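Spring Batch Practice 3: Scenario 2, Batch System Query Optimization & Performance Comparison

Spring Batch Practice: Batch Query Optimization and Performance Measurement & Comparison with Grafana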

Jun 14, 2025

Contents

Overview
Scenario2
Version1
Version2
Improvements
✅ Background
🔧 What Changed
💡 Results
🔍 Performance Comparison (Grafana-based)

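This post covers a case where I improved the performance of a Spring Batch based batch system through query optimization and more efficient memory use.
The performance metrics were compared visually on a Grafana dashboard.
If you are curious about the monitoring setup and the practice environment, please see the previous post.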

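Overview

Continuing from the previous post, let's write practice scenario 2.

We will also write a Version2 batch system by applying a few optimizations to the scenario 2 code.

Then we will monitor the two versions side by side and measure their performance.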

Scenario2

Scenario 2 batch system description

Purpose: monthly product sales aggregation

Input: `orders`, `order_items`

Output: `monthly_sales_summary`

Learning points: join handling, filtering data by month, using JobParameter
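
The "filter by month" and "JobParameter" learning points can be sketched as a small helper that turns a month string into a date-time range. This is only a sketch under assumptions: the parameter name `targetMonth` and the class `MonthRange` are hypothetical and do not appear in the project code.

```java
import java.time.LocalDateTime;
import java.time.YearMonth;

/** Half-open [start, endExclusive) date-time range covering one calendar month. */
final class MonthRange {
    final LocalDateTime start;
    final LocalDateTime endExclusive;

    private MonthRange(LocalDateTime start, LocalDateTime endExclusive) {
        this.start = start;
        this.endExclusive = endExclusive;
    }

    /** Parses a value such as "2025-05" (ISO yyyy-MM) into a month range. */
    static MonthRange of(String targetMonth) {
        YearMonth month = YearMonth.parse(targetMonth);
        return new MonthRange(
                month.atDay(1).atStartOfDay(),                 // first instant of the month
                month.plusMonths(1).atDay(1).atStartOfDay());  // first instant of the next month
    }

    public static void main(String[] args) {
        // "2025-05" stands in for a hypothetical job parameter value.
        MonthRange range = MonthRange.of("2025-05");
        System.out.println(range.start + " .. " + range.endExclusive);
    }
}
```

In a Spring Batch job, the month string would typically come from a job parameter injected into a `@StepScope` bean via `@Value("#{jobParameters['targetMonth']}")`. With a half-open range, the reader query can compare with `>= :startDate AND < :endDate`, which avoids missing rows timestamped after 23:59:59 on the last day.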

Version1

First, here is the initial version.


package chan.springbatch.springbatchex.config.job;

import chan.springbatch.springbatchex.config.job.listener.PushGatewayJobListener;
import chan.springbatch.springbatchex.domain.order.OrderItem;
import chan.springbatch.springbatchex.domain.order.Orders;
import chan.springbatch.springbatchex.domain.summary.MonthlySalesSummary;
import chan.springbatch.springbatchex.domain.user.User;
import chan.springbatch.springbatchex.repository.MonthlySalesSummaryRepository;
import jakarta.persistence.EntityManagerFactory;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.database.JpaCursorItemReader;
import org.springframework.batch.item.database.JpaPagingItemReader;
import org.springframework.batch.item.database.builder.JpaCursorItemReaderBuilder;
import org.springframework.batch.item.database.builder.JpaPagingItemReaderBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

import java.time.LocalDateTime;
import java.time.YearMonth;
import java.util.HashMap;
import java.util.Map;

@Slf4j
@Configuration
@RequiredArgsConstructor
public class MonthlySalesSummaryJobConfig {

    private static final int CHUNK_SIZE = 1000;
    private final JobRepository jobRepository;
    private final EntityManagerFactory emf;
    private final PlatformTransactionManager transactionManager;
    private final MonthlySalesSummaryRepository monthlySalesSummaryRepository;

    @Bean
    public Job monthlySalesSummaryJob() {
        return new JobBuilder("monthlySalesSummaryJob", jobRepository)
                .start(monthlySalesSummaryStep())
                .listener(new JobExecutionListener() {
                    @Override
                    public void afterJob(JobExecution jobExecution) {
                        jobExecution.getStepExecutions().forEach(stepExecution -> {
                            log.info("[{}] readCount: {}", stepExecution.getStepName(), stepExecution.getReadCount());
                        });
                    }
                })
                .build();
    }

    @Bean
    public Step monthlySalesSummaryStep() {
        return new StepBuilder("monthlySalesSummaryStep", jobRepository)
                .<Orders, MonthlySalesSummary>chunk(CHUNK_SIZE, transactionManager)
                .reader(orderJpaCursorItemReader())
                .processor(orderToSummaryProcessor())
                .writer(monthlySalesSummaryWriter())
                .build();

    }

    @Bean
    public JpaCursorItemReader<Orders> orderJpaCursorItemReader() {
        return new JpaCursorItemReaderBuilder<Orders>()
                .name("orderJpaCursorItemReader")
                .entityManagerFactory(emf)
                .queryString("""
                    SELECT o
                    FROM Orders o
                    LEFT JOIN FETCH o.user
                    LEFT JOIN FETCH o.orderItems
                    WHERE o.orderAt BETWEEN :startDate AND :endDate
                """)
                .parameterValues(Map.of(
                        "startDate", LocalDateTime.of(2025, 5, 1, 0, 0),
                        "endDate", LocalDateTime.of(2025,5,31,23,59,59)
                ))
                .build();
    }

    @Bean
    public ItemProcessor<Orders, MonthlySalesSummary> orderToSummaryProcessor() {
        return order -> {
            log.info("orders : {}" , order);
            User user = order.getUser();
            YearMonth month = YearMonth.from(order.getOrderAt());
            long total = order.getOrderItems().stream()
                    .mapToLong(item -> (long) item.getQuantity() * item.getPrice())
                    .sum();

            return MonthlySalesSummary.builder()
                    .user(user)
                    .yearMonth(month.toString())
                    .monthlySpendMoney(total)
                    .monthlyOrderCount(1)
                    .build();
        };
    }


    @Bean
    public ItemWriter<MonthlySalesSummary> monthlySalesSummaryWriter() {
        return items -> {
            log.info("Writing {} summaries", items.size());
            Map<String, MonthlySalesSummary> summaryMap = new HashMap<>();

            for (MonthlySalesSummary item : items) {
                String key = item.getUser().getId() + "_" + item.getYearMonth();
                summaryMap.compute(key, (k, existing) -> {
                    if (existing == null) {
                        return MonthlySalesSummary.builder()
                                .user(item.getUser())
                                .yearMonth(item.getYearMonth())
                                .monthlySpendMoney(item.getMonthlySpendMoney())
                                .monthlyOrderCount(item.getMonthlyOrderCount())
                                .build();
                    } else {
                        existing.accumulateSpendMoney(item.getMonthlySpendMoney());
                        existing.plusOrderCount(item.getMonthlyOrderCount());
                        return existing;
                    }
                });
            }

            // Look up previously stored rows and merge the new totals into them
            for (Map.Entry<String, MonthlySalesSummary> entry : summaryMap.entrySet()) {
                MonthlySalesSummary newSummary = entry.getValue();
                Long userId = newSummary.getUser().getId();
                String yearMonth = newSummary.getYearMonth();

                monthlySalesSummaryRepository
                        .findByUserIdAndYearMonth(userId, yearMonth)
                        .ifPresentOrElse(
                                existing -> {
                                    existing.accumulateSpendMoney(newSummary.getMonthlySpendMoney());
                                    existing.plusOrderCount(newSummary.getMonthlyOrderCount());
                                },
                                () -> monthlySalesSummaryRepository.save(newSummary)
                        );
            }
        };
    }
}

Version2

After finishing Version1, I ran it with Quartz and confirmed that it worked correctly.

I then wrote Version2, which improves the performance of the Writer.


package chan.springbatch.springbatchex.config.job;

import chan.springbatch.springbatchex.config.job.listener.PushGatewayJobListener;
import chan.springbatch.springbatchex.domain.order.OrderItem;
import chan.springbatch.springbatchex.domain.order.Orders;
import chan.springbatch.springbatchex.domain.summary.MonthlySalesSummary;
import chan.springbatch.springbatchex.domain.user.User;
import chan.springbatch.springbatchex.repository.MonthlySalesSummaryRepository;
import jakarta.persistence.EntityManagerFactory;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.database.JpaCursorItemReader;
import org.springframework.batch.item.database.JpaPagingItemReader;
import org.springframework.batch.item.database.builder.JpaCursorItemReaderBuilder;
import org.springframework.batch.item.database.builder.JpaPagingItemReaderBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

import java.time.LocalDateTime;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

@Slf4j
@Configuration
@RequiredArgsConstructor
public class MonthlySalesSummaryJobConfig {

    private static final int CHUNK_SIZE = 1000;
    private final JobRepository jobRepository;
    private final EntityManagerFactory emf;
    private final PlatformTransactionManager transactionManager;
    private final MonthlySalesSummaryRepository monthlySalesSummaryRepository;

    @Bean
    public Job monthlySalesSummaryJob() {
        return new JobBuilder("monthlySalesSummaryJob", jobRepository)
                .start(monthlySalesSummaryStep())
                .listener(new JobExecutionListener() {
                    @Override
                    public void afterJob(JobExecution jobExecution) {
                        jobExecution.getStepExecutions().forEach(stepExecution -> {
                            log.info("[{}] readCount: {}", stepExecution.getStepName(), stepExecution.getReadCount());
                        });
                    }
                })
                .build();
    }

    @Bean
    public Step monthlySalesSummaryStep() {
        return new StepBuilder("monthlySalesSummaryStep", jobRepository)
                .<Orders, MonthlySalesSummary>chunk(CHUNK_SIZE, transactionManager)
                .reader(orderJpaCursorItemReader())
                .processor(orderToSummaryProcessor())
                .writer(monthlySalesSummaryWriter())
                .build();

    }

    @Bean
    public JpaCursorItemReader<Orders> orderJpaCursorItemReader() {
        return new JpaCursorItemReaderBuilder<Orders>()
                .name("orderJpaCursorItemReader")
                .entityManagerFactory(emf)
                .queryString("""
                    SELECT o
                    FROM Orders o
                    LEFT JOIN FETCH o.user
                    LEFT JOIN FETCH o.orderItems
                    WHERE o.orderAt BETWEEN :startDate AND :endDate
                """)
                .parameterValues(Map.of(
                        "startDate", LocalDateTime.of(2025, 5, 1, 0, 0),
                        "endDate", LocalDateTime.of(2025,5,31,23,59,59)
                ))
                .build();
    }

    @Bean
    public ItemProcessor<Orders, MonthlySalesSummary> orderToSummaryProcessor() {
        return order -> {
            log.info("orders : {}" , order);
            User user = order.getUser();
            YearMonth month = YearMonth.from(order.getOrderAt());
            long total = order.getOrderItems().stream()
                    .mapToLong(item -> (long) item.getQuantity() * item.getPrice())
                    .sum();

            return MonthlySalesSummary.builder()
                    .user(user)
                    .yearMonth(month.toString())
                    .monthlySpendMoney(total)
                    .monthlyOrderCount(1)
                    .build();
        };
    }


    @Bean
    public ItemWriter<MonthlySalesSummary> monthlySalesSummaryWriter() {
        return items -> {
            log.info("Writing {} summaries", items.size());

            // Step 1: aggregate in memory by (userId, yearMonth)
            Map<String, MonthlySalesSummary> summaryMap = new HashMap<>();
            for (MonthlySalesSummary item : items) {
                String key = item.getUser().getId() + "_" + item.getYearMonth();
                summaryMap.merge(key, item, (existing, incoming) -> {
                    existing.accumulateSpendMoney(incoming.getMonthlySpendMoney());
                    existing.plusOrderCount(incoming.getMonthlyOrderCount());
                    return existing;
                });
            }

            // Step 2: fetch the existing rows from the DB in one query
            // (assumes every item in the chunk belongs to the same month)
            String yearMonth = summaryMap.values().iterator().next().getYearMonth();

            List<Long> userIds = summaryMap.values().stream()
                    .map(summary -> summary.getUser().getId())
                    .distinct()
                    .toList();

            // Fetch efficiently with a single query: userId IN (...) AND yearMonth = ...
            List<MonthlySalesSummary> existingSummaries =
                    monthlySalesSummaryRepository.findAllByUserIdInAndYearMonth(userIds, yearMonth);

            // Convert the existing rows into a map (key: userId_yearMonth)
            Map<String, MonthlySalesSummary> existingMap = existingSummaries.stream()
                    .collect(Collectors.toMap(
                            s -> s.getUser().getId() + "_" + s.getYearMonth(),
                            Function.identity()
                    ));

            // Step 3: merge and save
            List<MonthlySalesSummary> toSave = new ArrayList<>();
            for (Map.Entry<String, MonthlySalesSummary> entry : summaryMap.entrySet()) {
                String key = entry.getKey();
                MonthlySalesSummary newSummary = entry.getValue();

                if (existingMap.containsKey(key)) {
                    MonthlySalesSummary existing = existingMap.get(key);
                    existing.accumulateSpendMoney(newSummary.getMonthlySpendMoney());
                    existing.plusOrderCount(newSummary.getMonthlyOrderCount());
                    toSave.add(existing);
                } else {
                    toSave.add(newSummary);
                }
            }

            // Save everything in one call
            monthlySalesSummaryRepository.saveAll(toSave);
        };
    }

}

The main change in Version2 is that the Writer now does less DB I/O and uses less memory.

Improvements

✅ Background

The original batch system issued far too many queries for each chunk.

A MonthlySalesSummary SELECT query for every Order processed

An N+1 problem between MonthlySalesSummary and User

A MonthlySalesSummary save() call for every Order, which increased database load

🔧 What Changed

Fetch every MonthlySalesSummary needed by a chunk in a single query

Fetch join MonthlySalesSummary and User to resolve the N+1 problem

Call MonthlySalesSummary saveAll() once per chunk for a bulk update

Optimized the aggregation logic
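
The first change in the list above (one lookup per chunk instead of one lookup per key) can be illustrated without a database. The sketch below is a simplified model, not the project's repository: `CountingSummaryStore` is a hypothetical in-memory stand-in that only counts how many "queries" it serves.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** In-memory stand-in for the summary table that counts the lookups it serves. */
final class CountingSummaryStore {
    private final Map<Long, Long> spendByUserId = new HashMap<>();
    int queryCount = 0;

    void put(long userId, long spend) {
        spendByUserId.put(userId, spend);
    }

    /** One "query" per user, as in the Version1 writer. */
    Optional<Long> findByUserId(long userId) {
        queryCount++;
        return Optional.ofNullable(spendByUserId.get(userId));
    }

    /** One "query" for the whole chunk, as in the Version2 writer. */
    Map<Long, Long> findAllByUserIdIn(List<Long> userIds) {
        queryCount++;
        Map<Long, Long> found = new HashMap<>();
        for (Long id : userIds) {
            Long spend = spendByUserId.get(id);
            if (spend != null) {
                found.put(id, spend);
            }
        }
        return found;
    }
}

final class LookupComparison {
    /** Looks up each user separately and returns the number of queries issued. */
    static int perKeyLookups(CountingSummaryStore store, List<Long> userIds) {
        store.queryCount = 0;
        for (Long id : userIds) {
            store.findByUserId(id);
        }
        return store.queryCount;
    }

    /** Looks up all users at once and returns the number of queries issued. */
    static int batchedLookups(CountingSummaryStore store, List<Long> userIds) {
        store.queryCount = 0;
        store.findAllByUserIdIn(userIds);
        return store.queryCount;
    }

    public static void main(String[] args) {
        CountingSummaryStore store = new CountingSummaryStore();
        List<Long> userIds = new ArrayList<>();
        for (long id = 1; id <= 1000; id++) {
            userIds.add(id);
            store.put(id, id * 10);
        }
        System.out.println("per-key queries: " + perKeyLookups(store, userIds));
        System.out.println("batched queries: " + batchedLookups(store, userIds));
    }
}
```

The per-key count grows with the number of distinct keys in the chunk, while the batched count stays at one regardless of chunk size.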

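💡 Results

Query count reduced dramatically (210,000 → 200)

Version1 query count: (MonthlySalesSummary SELECTs (1,000) + User SELECTs (up to 100) + UPDATE queries (1,000)) * number of chunks (100) = 210,000
Version2 query count: (MonthlySalesSummary SELECT (1) + bulk update for the whole chunk (1)) * number of chunks (100) = 200

Optimized memory use in the aggregation logic

The query counts can be confirmed in the logs.

Next, let's look at the batch metrics to see which aspects of performance improved.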
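
The query-count estimate above can be reproduced as a small calculation. The sketch only restates the arithmetic from this post (1,000 SELECTs, up to 100 User SELECTs, and 1,000 UPDATEs per chunk over 100 chunks); the class and method names are hypothetical.

```java
/** Reproduces the per-job query-count estimate for both versions. */
final class QueryCountEstimate {

    /** Version1: per-chunk summary SELECTs + User SELECTs + UPDATEs, times the chunk count. */
    static long version1(long summarySelects, long userSelects, long updates, long chunks) {
        return (summarySelects + userSelects + updates) * chunks;
    }

    /** Version2: one batched SELECT and one bulk save per chunk. */
    static long version2(long chunks) {
        return (1 + 1) * chunks;
    }

    public static void main(String[] args) {
        long v1 = version1(1_000, 100, 1_000, 100);
        long v2 = version2(100);
        System.out.println("Version1: " + v1);
        System.out.println("Version2: " + v2);
        System.out.println("Reduction factor: " + (v1 / v2));
    }
}
```

With these inputs the estimate drops by a factor of 1,050. The Version1 figure is an upper bound, since the User SELECT count is given as "up to 100" per chunk.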

🔍 Performance Comparison (Grafana-based)

Grafana Snapshot

https://snapshots.raintank.io/dashboard/snapshot/lNpq4JT2jIYDu5VSXq8idqHHG637gqNs?orgId=0&refresh=10s

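Below, we compare the system before and after the improvement using the metrics collected in Grafana.

In each graph the red line is the divider: the left side is Version2 (after the improvement) and the right side is Version1 (before the improvement).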

📈 Item Read / Write Throughput

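There was no significant difference in read/write throughput.

📈 JVM Heap Memory Usage

Graph legend

Green: Eden space (Young Generation)

Yellow: Old generation

Interpretation

Eden space memory usage is clearly lower in Version2.

This is because the improved Writer logic in Version2 creates fewer intermediate objects, which also leaves fewer objects for the GC to collect.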


// Version2 aggregation logic
Map<String, MonthlySalesSummary> summaryMap = new HashMap<>();
for (MonthlySalesSummary item : items) {
    String key = item.getUser().getId() + "_" + item.getYearMonth();
    summaryMap.merge(key, item, (existing, incoming) -> {
        existing.accumulateSpendMoney(incoming.getMonthlySpendMoney());
        existing.plusOrderCount(incoming.getMonthlyOrderCount());
        return existing;
    });
}
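
To make the "fewer intermediate objects" point concrete, the sketch below counts object creation for the two aggregation styles. It is a simplified model: `Summary` is a hypothetical stand-in for MonthlySalesSummary, keyed by user id only, and it does not reproduce the Grafana measurement itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal stand-in for MonthlySalesSummary that counts how many instances are created. */
final class Summary {
    static int created = 0;
    final long userId;
    long spend;
    int orders;

    Summary(long userId, long spend, int orders) {
        created++;
        this.userId = userId;
        this.spend = spend;
        this.orders = orders;
    }
}

final class AggregationAllocations {

    /** Version1 style: the first item seen for a key is copied into a new object. */
    static Map<Long, Summary> aggregateWithCopy(List<Summary> items) {
        Map<Long, Summary> map = new HashMap<>();
        for (Summary item : items) {
            map.compute(item.userId, (key, existing) -> {
                if (existing == null) {
                    return new Summary(item.userId, item.spend, item.orders);
                }
                existing.spend += item.spend;
                existing.orders += item.orders;
                return existing;
            });
        }
        return map;
    }

    /** Version2 style: the first item for a key is stored as-is; later items are folded into it. */
    static Map<Long, Summary> aggregateInPlace(List<Summary> items) {
        Map<Long, Summary> map = new HashMap<>();
        for (Summary item : items) {
            map.merge(item.userId, item, (existing, incoming) -> {
                existing.spend += incoming.spend;
                existing.orders += incoming.orders;
                return existing;
            });
        }
        return map;
    }

    /** Three orders for user 1 and one order for user 2. */
    static List<Summary> sampleItems() {
        List<Summary> items = new ArrayList<>();
        items.add(new Summary(1L, 100, 1));
        items.add(new Summary(1L, 200, 1));
        items.add(new Summary(2L, 50, 1));
        items.add(new Summary(1L, 300, 1));
        return items;
    }
}
```

The copy-based style creates one extra object per distinct key in each chunk, while the `merge`-based style creates none. Note that the in-place style mutates the first input item for each key, which is acceptable here because the items are not reused after the write.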

📈 Avg Chunk Write Latency

Interpretation

The average DB write time per chunk improved markedly in Version2 (roughly 2x faster).

I attribute this to the bulk update (saveAll) and the removal of unnecessary work in the Writer.

📈 Process CPU Usage

There was no significant difference in CPU usage.

This shows that the improvement optimized DB and memory efficiency rather than CPU.