
Output format issue: LaTeX and HTML results are not aligned? #19

@DayDreamerEric

Description

  • Test image:
    213E8288-DEDB-497E-B208-5C5BC301867E

Command line:
python demo.py --image_path ./demo.jpg --ckpt_path U4R/StructTable-InternVL2-1B --output_format latex
python demo.py --image_path ./demo.jpg --ckpt_path U4R/StructTable-InternVL2-1B --output_format html

  • LaTeX output
\begin{tabular}{|l|l|l|l|}
\hline
\multirow{2}{*}{\textbf{名称}} & \multirow{2}{*}{\textbf{产量} (吨)} & \multicolumn{2}{c}{\textbf{环比}} \\
\cline{3-4}
 &  & \textbf{增长量} (吨) & \textbf{增长率} (\%) \\
\hline
荔枝 & 11 & 1 & 10\\
\hline
芒果 & 9 & --1 & --10\\
\hline
香蕉 & 6 & 1 & 20\\
\hline
\end{tabular}

Converted to HTML with GPT-4o, the result is:

<table>
<tr>
<th colspan='2' rowspan='2'>名称</th>
<th colspan='2'>产量 (吨)</th>
<th colspan='2'>环比</th>
</tr>
<tr>
<td>增长量 (吨)</td>
<td>增长率 (\%)</td>
</tr>
<tr>
<td>荔枝</td>
<td>11</td>
<td>1</td>
<td>10</td>
</tr>
<tr>
<td>芒果</td>
<td>9</td>
<td>–1</td>
<td>–10</td>
</tr>
<tr>
<td>香蕉</td>
<td>6</td>
<td>1</td>
<td>20</td>
</tr>
</table>
  • HTML output
<table>
<tr>
<th colspan='2' rowspan='2'>名称</th>
<th colspan='2'>产量 (吨)</th>
<th colspan='2'>环比</th>
</tr>
<tr>
<td>增长量 (吨)</td>
<td>增长率 (\%)</td>
</tr>
<tr>
<td>荔枝</td>
<td>11</td>
<td>1</td>
<td>10</td>
</tr>
<tr>
<td>芒果</td>
<td>9</td>
<td>–1</td>
<td>–10</td>
</tr>
<tr>
<td>香蕉</td>
<td>6</td>
<td>1</td>
<td>20</td>
</tr>
</table>
  • Visual comparison: LaTeX result vs. HTML result
image
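
A minimal sketch, not part of the StructTable repo, for checking the mismatch numerically: it counts the effective width of each HTML row, honoring `colspan` and `rowspan`, using only Python's standard `html.parser`. The names `SpanCollector` and `row_widths` are hypothetical, and `SAMPLE_HTML` is an abbreviated copy of the HTML output above (the two header rows plus the first data row).

```python
from html.parser import HTMLParser


class SpanCollector(HTMLParser):
    """Collect (colspan, rowspan) for every <td>/<th>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            colspan = int(a.get("colspan", 1))
            rowspan = int(a.get("rowspan", 1))
            self.rows[-1].append((colspan, rowspan))


def row_widths(html):
    """Return the effective column count of each row in the table."""
    parser = SpanCollector()
    parser.feed(html)
    carry = {}  # row index -> columns already occupied by earlier rowspans
    widths = []
    for i, cells in enumerate(parser.rows):
        width = carry.pop(i, 0)
        for colspan, rowspan in cells:
            width += colspan
            # A cell with rowspan > 1 also occupies columns in later rows.
            for j in range(i + 1, i + rowspan):
                carry[j] = carry.get(j, 0) + colspan
        widths.append(width)
    return widths


# Abbreviated copy of the HTML output reported in this issue.
SAMPLE_HTML = r"""<table>
<tr>
<th colspan='2' rowspan='2'>名称</th>
<th colspan='2'>产量 (吨)</th>
<th colspan='2'>环比</th>
</tr>
<tr><td>增长量 (吨)</td><td>增长率 (\%)</td></tr>
<tr><td>荔枝</td><td>11</td><td>1</td><td>10</td></tr>
</table>"""

print(row_widths(SAMPLE_HTML))  # [6, 4, 4]
```

The LaTeX output declares four columns (`{|l|l|l|l|}`), while the first HTML row comes out six columns wide, so the `colspan` values in the HTML header appear to be where the two formats diverge.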
