Apache POI (the XSSFWorkbook API) has a memory issue that was triggering full GCs. To fix this,
we needed an improved API that does not drive up memory usage when reading Excel files.
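After some googling, it turned out that someone had kindly already developed such a module.
The solution is to use StreamingReader.
To apply the fix, refer to the content below.
It is explained well on that blog.
At work, we used this module to resolve a chronic problem.
The overview is as follows: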
This library will take a provided InputStream and output it to the file system. The stream is piped safely through a configurable-sized buffer to prevent large usage of memory. Once the file is created, it is then streamed into memory from the file system.
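The buffered copy described above can be sketched with the JDK alone. This is a minimal illustration of the idea, not the library's actual implementation: the class name TempFileSpool, the method spoolToTempFile, and the 4096-byte buffer are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of POI or of the streaming library.
final class TempFileSpool {

    // Copies `in` into a new temporary file through a fixed-size buffer,
    // so at most `bufferSize` bytes of the payload sit in memory at once.
    static Path spoolToTempFile(InputStream in, int bufferSize) throws IOException {
        Path tmp = Files.createTempFile("workbook-", ".xlsx");
        byte[] buffer = new byte[bufferSize];
        try (OutputStream out = Files.newOutputStream(tmp)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an uploaded workbook: 10,000 arbitrary bytes.
        byte[] payload = new byte[10_000];
        Path tmp = spoolToTempFile(new ByteArrayInputStream(payload), 4096);
        try {
            System.out.println("spooled bytes: " + Files.size(tmp));
        } finally {
            // The caller owns the temp file and removes it when done.
            Files.deleteIfExists(tmp);
        }
    }
}
```

The buffer size is the only knob that affects heap usage here; the size of the input only affects how much disk space the temporary file takes.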
The reason for needing the stream being outputted in this manner has to do with how ZIP files work. Because the XLSX file format is basically a ZIP file, it's not possible to find all of the entries without reading the entire InputStream.
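To make the ZIP point concrete: the authoritative index of a ZIP archive (the central directory) is stored at the end of the archive, so a reader with random access to a file can list every entry up front. The sketch below uses only java.util.zip; the class ZipEntryListing, its method names, and the sample entry names are assumptions for illustration and only mimic the layout of an XLSX package.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; nothing here comes from POI or the streaming library.
final class ZipEntryListing {

    // Writes a small ZIP archive with the given entry names to a temp file.
    static Path writeSampleArchive(String... entryNames) throws IOException {
        Path tmp = Files.createTempFile("sample-", ".zip");
        try (OutputStream out = Files.newOutputStream(tmp);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<placeholder/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return tmp;
    }

    // ZipFile works on a file, reads the central directory,
    // and can therefore report all entries without decompressing any of them.
    static List<String> listEntries(Path archive) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path archive = writeSampleArchive(
                "[Content_Types].xml", "xl/workbook.xml", "xl/worksheets/sheet1.xml");
        try {
            listEntries(archive).forEach(System.out::println);
        } finally {
            Files.deleteIfExists(archive);
        }
    }
}
```

A plain InputStream offers no such random access, which is why the data has to land somewhere complete (memory or a file) before the full entry list is available.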
This is a problem that can't really be gotten around for POI, as it needs a complete list of ZIP entries. The default implementation of reading from an InputStream in POI is to read the entire stream directly into memory. This library works by reading out the stream into a temporary file. As part of the auto-close action, the temporary file is deleted.
If you need more control over how the file is created/disposed of, there is an option to initialize the library with a java.io.File. This file will not be written to or removed:
File f = new File("/path/to/workbook.xlsx");
Workbook workbook = StreamingReader.builder()
        .open(f);  // open(File) reads the given file in place
This library will ONLY work with XLSX files. The older XLS format is not capable of being streamed.