Design and Implementation of an Excel Parsing Utility Class

Author: 浑身演技 | Published 2017-05-17 18:36

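Document authors: 张家骏, 林平钏
Development team: 张红洁, 张家骏, 覃以才, 于石, 林平钏
Background: The project is a CRM-style financial system that frequently has to parse Excel files submitted by customers. During development, our team realized that the Excel parsing could be decoupled from the business code and packaged as a standalone utility class.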

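Case Analysis

A simple Excel sheet

Looking at the sheet, the titles in the header row are really the properties of a Bean. This means that, starting from the second row, every row of the sheet is one instance of that Bean.

So, given a concrete Bean class, can we obtain all of the Bean instances from the contents of the Excel file?

That question was the inspiration for our team's design.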

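Design

The team's solution is as follows:
1. How to design the Bean class

Supply tuples that pair each header-row title with a property of the Bean class. If none are supplied, go to the next step;
Check whether the properties of the Bean class carry a header-title annotation. If they do, build the tuples of the previous step from the annotations; if not, go to the next step;
Map the header-row titles to the properties of the Bean class directly by position, one to one. No tuples are built in this case.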
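The three-step fallback above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's code: `Title`, `Token`, `Person`, and `resolve` are hypothetical stand-ins for the annotation, the tuple, a Bean class, and the lookup logic that the article implements later.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the fallback: explicit tuples, then annotations, then column order.
final class MappingResolution {

    // Stand-in for a header-title annotation on a Bean field.
    @Target(ElementType.FIELD)
    @Retention(RetentionPolicy.RUNTIME)
    @interface Title {
        String value();
    }

    // Stand-in for a (title, property) tuple.
    static final class Token {
        final String title;
        final String property;

        Token(String title, String property) {
            this.title = title;
            this.property = property;
        }
    }

    // Example Bean: only one field carries a title annotation.
    static final class Person {
        @Title("Name")
        String name;
        int age;
    }

    // Returns the tuples to use; an empty list means "map columns by position".
    static List<Token> resolve(List<Token> explicit, Class<?> beanClass) {
        // Step 1: tuples supplied by the caller take precedence.
        if (explicit != null && !explicit.isEmpty()) {
            return explicit;
        }
        // Step 2: otherwise build tuples from the annotated fields.
        List<Token> fromAnnotations = new ArrayList<>();
        for (Field field : beanClass.getDeclaredFields()) {
            Title title = field.getAnnotation(Title.class);
            if (title != null) {
                fromAnnotations.add(new Token(title.value(), field.getName()));
            }
        }
        // Step 3: nothing found, so the caller falls back to column order.
        return fromAnnotations;
    }

    public static void main(String[] args) {
        for (Token token : resolve(null, Person.class)) {
            System.out.println(token.title + " -> " + token.property); // prints Name -> name
        }
    }
}
```

Returning an empty list for the positional case keeps the three outcomes in one return type; a real implementation might prefer a separate flag.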

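2. How to obtain Bean instances

With tuples:

Take the first row of the sheet as the header row, then intersect the titles in the tuples with the titles in the sheet.
Then assign the values in the columns of that intersection to the Bean instance, one by one.
For example:
The tuples contain [标题2, 标题1, 这是一个意想不到的]
The header row contains [标题2, 标题1, 我的标题3, 我的日期, 超大的整数, 这是提高]
The intersection is [标题2, 标题1]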
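The intersection in this example can be reproduced with plain collections. The sketch below is an assumption-laden illustration (the class `TitleIntersection` and the method `intersect` are hypothetical names), using the same titles as the example above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: which tuple titles are actually present in the sheet?
final class TitleIntersection {

    // Keeps the tuple titles that also appear in the header row, in tuple order.
    static Set<String> intersect(List<String> tupleTitles, List<String> sheetTitles) {
        Set<String> result = new LinkedHashSet<>(tupleTitles);
        result.retainAll(sheetTitles);
        return result;
    }

    public static void main(String[] args) {
        List<String> tupleTitles = Arrays.asList("标题2", "标题1", "这是一个意想不到的");
        List<String> sheetTitles = Arrays.asList("标题2", "标题1", "我的标题3", "我的日期", "超大的整数", "这是提高");
        System.out.println(intersect(tupleTitles, sheetTitles)); // prints [标题2, 标题1]
    }
}
```

Titles that appear on only one side, such as 这是一个意想不到的, are simply ignored rather than treated as errors.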
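Without tuples:

Read the content directly, starting from the second row, and assign the cell values to the properties one by one in column order.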
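Positional assignment can be sketched with reflection alone. The names below (`PositionalMapping`, `Contact`, `fill`) are hypothetical, and the sketch assumes that `getDeclaredFields()` returns fields in declaration order, which common JVMs do but the Java specification does not guarantee. That dependence is one reason the positional mode is fragile.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "without tuples" mode: column i goes to the i-th declared field.
final class PositionalMapping {

    // Example Bean with two String properties.
    static final class Contact {
        String name;
        String phone;
    }

    // Assigns values.get(i) to the i-th declared field; surplus values or fields are ignored.
    static <T> T fill(T bean, List<String> values) {
        Field[] fields = bean.getClass().getDeclaredFields();
        int count = Math.min(fields.length, values.size());
        for (int i = 0; i < count; i++) {
            try {
                fields[i].setAccessible(true);
                fields[i].set(bean, values.get(i));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot set field " + fields[i].getName(), e);
            }
        }
        return bean;
    }

    public static void main(String[] args) {
        Contact contact = fill(new Contact(), Arrays.asList("Alice", "555"));
        System.out.println(contact.name + " / " + contact.phone);
    }
}
```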

Implementation

The annotation for the Bean class

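import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marks a Bean field with the header-row title of the column it is read from.
@Target(ElementType.FIELD)
@Retention(RetentionPolicy.RUNTIME)
public @interface CellTitle {
    String value();
}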

The tuple used to parse the sheet's header row

public final class SheetToken {
    // the title in the Excel header row
    private String title;
    // the name of the Bean property
    private String property;
    public String getTitle() {
        return title;
    }
    public void setTitle(String title) {
        this.title = title;
    }
    public String getProperty() {
        return property;
    }
    public void setProperty(String property) {
        this.property = property;
    }
}

The sheet parser

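import java.lang.reflect.Field;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
// StringUtils, CollectionUtils and ObjectUtil come from the project's classpath; the article does not show their packages

public class SheetLexer {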
    private static final Logger log = LoggerFactory.getLogger(SheetLexer.class);
    // the sheet to parse
    private Sheet sheet;
    // the token list for the sheet
    private List<SheetToken> tokens;
    // Whether to use strict mode. In strict mode, parsing stops at the first error and an exception is thrown. Off by default.
    private AtomicBoolean isStrict = new AtomicBoolean(false);

    public SheetLexer() {
        super();
    }

    /**
     * If tokens is not set, cells are mapped in the sheet's column order by default.
     * <b>Not recommended, because positional mapping is error-prone.</b>
     * 
     * @param sheet
     */
    public SheetLexer(Sheet sheet) {
        super();
        this.sheet = sheet;
    }

    public SheetLexer(Sheet sheet, AtomicBoolean isStrict) {
        super();
        this.sheet = sheet;
        this.isStrict = isStrict;
    }

    /**
     * Recommended: supply tokens so that columns are matched by title instead of by the sheet's column order.
     * 
     * @param sheet
     * @param tokens
     */
    public SheetLexer(Sheet sheet, List<SheetToken> tokens) {
        super();
        this.sheet = sheet;
        this.tokens = tokens;
    }

    /**
     * Strict mode is not recommended: in strict mode, parsing stops at the first error and an exception is thrown.
     * 
     * @see #SheetLexer(Sheet, List)
     * @param sheet
     * @param tokens
     * @param isStrict
     */
    public SheetLexer(Sheet sheet, List<SheetToken> tokens, AtomicBoolean isStrict) {
        super();
        this.sheet = sheet;
        this.tokens = tokens;
        this.isStrict = isStrict;
    }

    public AtomicBoolean getIsStrict() {
        return isStrict;
    }

    public void setIsStrict(AtomicBoolean isStrict) {
        this.isStrict = isStrict;
    }

    public Sheet getSheet() {
        return sheet;
    }

    public void setSheet(Sheet sheet) {
        this.sheet = sheet;
    }

    public List<SheetToken> getTokens() {
        return tokens;
    }

    public void setTokens(List<SheetToken> tokens) {
        this.tokens = tokens;
    }

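    /**
     * Maps each header title to its column index, using the first row as the header row. In strict mode, the
     * number of columns must equal the number of fields declared on the Bean.
     * 
     * @param <T>
     * 
     * @param clazz
     * @return
     */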
    private <T> Map<String, Integer> convertFrom(Class<T> clazz) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        if (ObjectUtil.nonNull(sheet)) {
            final int titleIndex = 0;
            if (sheet.getLastRowNum() >= titleIndex) {
                Row sheetTitles = sheet.getRow(titleIndex);
                if (sheetTitles == null) {
                    // the header row is physically missing, so there is nothing to map
                    return map;
                }
                if ((sheetTitles.getLastCellNum() - sheetTitles.getFirstCellNum()) != clazz.getDeclaredFields().length
                        && isStrict.get()) {
                    throw new RuntimeException("column count must equal the field count in strict mode");
                }

                for (int i = sheetTitles.getFirstCellNum(); i < sheetTitles.getLastCellNum(); i++) {
                    Cell cell = sheetTitles.getCell(i);
                    if (ObjectUtil.nonNull(cell) && StringUtils.isNotBlank(cell.getStringCellValue())) {
                        if (CollectionUtils.isNotEmpty(tokens)) {
                            for (SheetToken token : tokens) {
                                if (cell.getStringCellValue().equals(token.getTitle())) {
                                    map.put(cell.getStringCellValue(), cell.getColumnIndex());
                                }
                            }
                        } else {
                            map.put(cell.getStringCellValue(), cell.getColumnIndex());
                        }
                    }

                }
            }
        }
        return map;
    }

    private <T> void initTokens(Class<T> clazz) {
        if (CollectionUtils.isEmpty(tokens)) {

            Field[] fields = clazz.getDeclaredFields();
            for (Field field : fields) {
                if (field.isAnnotationPresent(CellTitle.class)) {
                    if (CollectionUtils.isEmpty(tokens))
                        tokens = new ArrayList<SheetToken>();
                    SheetToken token = new SheetToken();
                    token.setProperty(field.getName());
                    token.setTitle(field.getAnnotation(CellTitle.class).value());
                    tokens.add(token);
                }
            }
        }

    }

    /**
     * Sets the field's value from the cell.
     * 
     * @param field
     * @param cell
     * @param t
     * @throws IllegalArgumentException
     * @throws IllegalAccessException
     */
    private <T> void instantiateField(Field field, Cell cell, T t)
            throws IllegalArgumentException, IllegalAccessException {
        if (ObjectUtil.nonNull(field) && ObjectUtil.nonNull(cell) && ObjectUtil.nonNull(t)) {
            try {
                boolean flag = field.isAccessible();
                field.setAccessible(true);
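                // Cell.CELL_TYPE_* are the int constants of POI 3.x; POI 4 and later use the CellType enum instead
                if (cell.getCellType() == Cell.CELL_TYPE_STRING) {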
                    field.set(t, cell.getStringCellValue());
                } else if (cell.getCellType() == Cell.CELL_TYPE_BOOLEAN)
                    field.set(t, cell.getBooleanCellValue());
                else if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC) {
                    if (field.getType() == Integer.TYPE) {
                        field.setInt(t, (int) cell.getNumericCellValue());
                    } else if (field.getType() == Date.class) {
                        field.set(t, cell.getDateCellValue());
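                    } else if (field.getType() == BigDecimal.class) {
                        // valueOf goes through Double.toString and avoids the binary expansion of new BigDecimal(double)
                        field.set(t, BigDecimal.valueOf(cell.getNumericCellValue()));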
                    } else if (field.getType() == String.class) {
                        if (cell.getNumericCellValue() == ((int) cell.getNumericCellValue()))
                            field.set(t, String.valueOf((int) cell.getNumericCellValue()));
                        else
                            field.set(t, String.valueOf(cell.getNumericCellValue()));
                    } else {
                        field.set(t, cell.getNumericCellValue());
                    }
                }

                field.setAccessible(flag);
            } catch (Exception e) {
                log.error("{} happened while setting field {} from the sheet", e.getMessage(), field.getName());
                if (isStrict.get())
                    // keep the original exception as the cause so the stack trace is not lost
                    throw new RuntimeException(e.getMessage(), e);
            }

        }
    }

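    /**
     * <p>
     * Builds one Bean from the given row: each token's property is looked up on the model class by reflection
     * and filled from the column whose title matches.
     * </p>
     * <p>
     * If the properties and the columns differ, only their intersection is used.
     * </p>
     * 
     * @param row
     * @param clazz
     * @return
     * @throws IllegalAccessException
     * @throws InstantiationException
     * @throws SecurityException
     * @throws NoSuchFieldException
     */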
    public <T> T nextRow(int row, Class<T> clazz)
            throws InstantiationException, IllegalAccessException, NoSuchFieldException, SecurityException {
        T t = clazz.newInstance();
        if (ObjectUtil.nonNull(sheet)) {
            Map<String, Integer> map = convertFrom(clazz);
            // the requested row index must not exceed the index of the last row
            if (row <= sheet.getLastRowNum()) {
                Row rowContent = sheet.getRow(row);
                if (ObjectUtil.nonNull(tokens)) {
                    for (Entry<String, Integer> entry : map.entrySet()) {
                        Cell cell = rowContent.getCell(entry.getValue());
                        Field field = null;
                        if (ObjectUtil.nonNull(tokens)) {
                            for (SheetToken token : tokens) {
                                try {
                                    if (entry.getKey().equals(token.getTitle()))
                                        field = clazz.getDeclaredField(token.getProperty());
                                } catch (Exception e) {
                                    log.error("no such field {} in the excel titles", token.getProperty());
                                    if (isStrict.get())
                                        throw e;
                                }

                            }
                        }
                        instantiateField(field, cell, t);
                    }
                } else {
                    Field[] fields = clazz.getDeclaredFields();
                    if (isStrict.get()
                            && fields.length != (rowContent.getLastCellNum() - rowContent.getFirstCellNum())) {
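                        throw new RuntimeException("column count must equal the field count in strict mode");
                    }
                    // map column i to the i-th declared field; stop at fields.length so extra columns cannot overflow the array
                    for (int i = rowContent.getFirstCellNum(); i < rowContent.getLastCellNum() && i < fields.length; i++) {
                        instantiateField(fields[i], rowContent.getCell(i), t);
                    }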
                }

            }
        }
        return t;
    }

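    /**
     * <p>
     * Reads the rows in the closed range [begin, end]. Unlike {@link #allRows(Class)}, this method does not call
     * initTokens, so {@code @CellTitle} annotations are only honored if tokens were set beforehand.
     * </p>
     * 
     * @param begin
     * @param end
     * @param clazz
     * @return
     */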
    public <T> List<T> obtainRows(int begin, int end, Class<T> clazz) {
        List<T> list = new ArrayList<>();
        for (int i = begin; i <= end; i++) {
            try {
                list.add(nextRow(i, clazz));
            } catch (Exception e) {
                log.error("{} happen when instantiate {} from sheet ", e.getMessage(), clazz.getName());
                if (isStrict.get()) {
                    String message = e.getMessage();
                    if (e instanceof NoSuchFieldException) {
                        message = "no such field named " + e.getMessage() + " in the bean named " + clazz.getName();
                    }
                    throw new RuntimeException(message);
                }

            }
        }
        return list;
    }

    /**
     * <p>
     * Reads every row of the sheet after the header row.
     * </p>
     * 
     * @param clazz
     * @return
     */
    public <T> List<T> allRows(Class<T> clazz) {
        initTokens(clazz);
        List<T> list = new ArrayList<T>();
        if (ObjectUtil.nonNull(sheet)) {
            list = obtainRows(sheet.getFirstRowNum() + 1, sheet.getLastRowNum(), clazz);
        }
        return list;
    }
}
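One detail of `instantiateField` deserves a closer look: a numeric cell that targets a `String` property is checked with an `(int)` cast, so a whole number above `Integer.MAX_VALUE` (a phone number stored as a number, for instance) fails the check and comes out in scientific notation. The sketch below shows one possible alternative based on `BigDecimal`; it is a swapped-in technique rather than the team's code, and `NumericCellText` and `toText` are hypothetical names.

```java
import java.math.BigDecimal;

// Hypothetical alternative for turning a numeric cell value into text.
final class NumericCellText {

    // Renders whole numbers without ".0" and never uses scientific notation.
    // Assumes a finite value; BigDecimal.valueOf rejects NaN and infinities.
    static String toText(double value) {
        return BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        double phone = 1.3800138E10;
        // The (int) cast saturates at Integer.MAX_VALUE, so the whole-number check fails for large values.
        String viaIntCast = (phone == (int) phone) ? String.valueOf((int) phone) : String.valueOf(phone);
        System.out.println(viaIntCast);        // prints 1.3800138E10
        System.out.println(toText(phone));     // prints 13800138000
        System.out.println(toText(123.1234));  // prints 123.1234
        System.out.println(toText(5.0));       // prints 5
    }
}
```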

Test Case

Test entity class

public class TestToken {
    private int first;
    private double second;
    @CellTitle("我的标题3")
    private String thrid;   
    private Date fourth;
    private BigDecimal bigNum;
    public BigDecimal getBigNum() {
        return bigNum;
    }
    public void setBigNum(BigDecimal bigNum) {
        this.bigNum = bigNum;
    }
    public int getFirst() {
        return first;
    }
    public void setFirst(int first) {
        this.first = first;
    }
    public double getSecond() {
        return second;
    }
    public void setSecond(double second) {
        this.second = second;
    }
    public String getThrid() {
        return thrid;
    }
    public void setThrid(String thrid) {
        this.thrid = thrid;
    }
    public Date getFourth() {
        return fourth;
    }
    public void setFourth(Date fourth) {
        this.fourth = fourth;
    }
}

Test method

public static void main(String[] args) throws Exception {
    // WorkbookFactory (org.apache.poi.ss.usermodel) detects .xls vs .xlsx from the stream itself,
    // so there is no need to retry with HSSFWorkbook on a stream that XSSFWorkbook has already consumed
    try (InputStream input = new FileInputStream(new File("C:/Users/thinkive/Desktop/test.xlsx"));
            Workbook wb = WorkbookFactory.create(input)) {
        Sheet sheet = wb.getSheetAt(0);
        // TestToken has a single @CellTitle field, so only that column is read
        List<TestToken> testTokens = new SheetLexer(sheet).allRows(TestToken.class);
        System.out.println(new GsonBuilder().setPrettyPrinting().create().toJson(testTokens));
    }
}

[
  {
    "first": 0,
    "second": 0.0,
    "thrid": "123.1234"
  },
  {
    "first": 0,
    "second": 0.0,
    "thrid": "我轻轻地，尝一口，你说的爱我"
  }
]
