
Apache POI's XSSFWorkbook API has a memory issue that was triggering full GCs, so we needed a better API whose memory usage does not balloon while reading Excel files.


After some googling, I found that someone had kindly already written a module for this.

The solution is to use StreamingReader.


To apply the fix, refer to the link below. The project page explains it well.

https://github.com/monitorjbl/excel-streaming-reader


At work, we used this module to resolve what had been a chronic problem.


A summary follows.

  <dependency>
    <groupId>com.monitorjbl</groupId>
    <artifactId>xlsx-streamer</artifactId>
    <version>2.0.0</version>
  </dependency>

Implementation Details

This library will take a provided InputStream and output it to the file system. The stream is piped safely through a configurable-sized buffer to prevent large usage of memory. Once the file is created, it is then streamed into memory from the file system.
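
The spooling step described above can be sketched with plain JDK classes. This is a minimal illustration, not the library's actual code: the class name TempFileSpool, the method spoolToTempFile, and the temp-file prefix are made up for this example. It copies an InputStream to a temporary file through a fixed-size buffer, so memory use is bounded by the buffer size rather than by the size of the workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TempFileSpool {

    // Copies the stream to a temporary file using a fixed-size buffer, so at most
    // bufferSize bytes of the input are held in memory at any one time.
    public static Path spoolToTempFile(InputStream in, int bufferSize) {
        try {
            Path tmp = Files.createTempFile("spool-", ".xlsx");
            try (OutputStream out = Files.newOutputStream(tmp)) {
                byte[] buffer = new byte[bufferSize];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[10_000]; // stand-in for real workbook bytes
        Path tmp = spoolToTempFile(new ByteArrayInputStream(payload), 4096);
        try {
            System.out.println(Files.size(tmp)); // prints 10000
        } finally {
            Files.deleteIfExists(tmp); // the caller removes the temporary file when done
        }
    }
}
```

In this sketch the caller deletes the temporary file; the library, as described further down, does that as part of its auto-close action.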

The stream has to be written out this way because of how ZIP files work. An XLSX file is essentially a ZIP archive, so it is not possible to find all of its entries without reading the entire InputStream.

This is a problem that can't really be gotten around for POI, as it needs a complete list of ZIP entries. The default implementation of reading from an InputStream in POI is to read the entire stream directly into memory. This library works by reading out the stream into a temporary file. As part of the auto-close action, the temporary file is deleted.
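
Why a file on disk helps can be seen with the JDK's own ZIP classes. A ZIP archive keeps its index (the central directory) at the end, so a ZipInputStream can only discover entries one at a time while reading forward, whereas ZipFile can seek within a file and list every entry up front. The sketch below is only an illustration: the names ZipEntryListing, sampleZip, namesFromStream, and namesFromFile are hypothetical, the two-entry archive is hand-built, and neither POI nor the library is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipEntryListing {

    // Builds a tiny ZIP archive in memory; the entry names mimic an XLSX layout.
    public static byte[] sampleZip() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
                for (String name : new String[] {"[Content_Types].xml", "xl/workbook.xml"}) {
                    zip.putNextEntry(new ZipEntry(name));
                    zip.write("<x/>".getBytes(StandardCharsets.UTF_8));
                    zip.closeEntry();
                }
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // From a stream: entries are discovered one by one while reading forward,
    // so the full list is known only after the whole stream has been consumed.
    public static List<String> namesFromStream(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // From a file: ZipFile reads the central directory, so every entry name
    // is available without decompressing any entry data.
    public static List<String> namesFromFile(byte[] zipBytes) {
        try {
            Path tmp = Files.createTempFile("sample-", ".zip");
            try {
                Files.write(tmp, zipBytes);
                try (ZipFile zf = new ZipFile(tmp.toFile())) {
                    return zf.stream().map(ZipEntry::getName).collect(Collectors.toList());
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] zip = sampleZip();
        System.out.println(namesFromStream(zip));
        System.out.println(namesFromFile(zip));
    }
}
```

Both methods return the same two names. The difference is in how they get there: the stream version has to walk the whole archive, while the file version reads the index directly.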

If you need more control over how the file is created/disposed of, there is an option to initialize the library with a java.io.File. This file will not be written to or removed:

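import com.monitorjbl.xlsx.StreamingReader;
import org.apache.poi.ss.usermodel.Workbook;
import java.io.File;

File f = new File("/path/to/workbook.xlsx");
Workbook workbook = StreamingReader.builder()
        .rowCacheSize(100)    // number of rows to keep in memory (defaults to 10)
        .bufferSize(4096)     // buffer size used when reading an InputStream to file (defaults to 1024)
        .open(f);             // the File is read in place; it is not written to or removed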

This library will ONLY work with XLSX files. The older XLS format is not capable of being streamed.



