[Hacker News repost] IronCalc – Open-Source Spreadsheet Engine
Title: IronCalc – Open-Source Spreadsheet Engine
Text:
Url: https://www.ironcalc.com/
Post by: kaathewise
Comments:
nhatcher: Hey! This is my project!
Amazed to see this here.
I'll try to answer questions people might have.
wuming2: It’s a great ambition to replace Excel and many went down this path before. Congratulations to even attempting to achieve it and going this far to do it.
Excel compatibility, when fully realized, will remove the major obstacle to adoption. Given the current stronghold Excel has on the market.
Once that is achieved do you plan to offer a transition to more modern forms of calculations as vectors and arrays formula panels together with frozen sheets of raw data and output? Thus separating logic and model.
Also when your solution adoption will have grown much larger you should learn from the experiences Bavaria and CERN went through. Microsoft stronghold more often than not has nothing to do with technical prowess.
mgkimsal: Can't tell by the docs (but I've not dug in much yet) but... @nhatcher... here's a use case question.
Admins use the ironcalc UI to create their formulas. Is there a way to get those formulas to a backend and run the calcs on the server itself (separate from the user's browser) to get results based on input from other sources?
The UI half looks great so far. I have a colleague I was going to recommend this to, but they more need the 'run the formulas on the server' part more than anything else. They've got some custom Rust stuff running already, but having admins come up with their advanced formulas, then translating that to server code - that takes the time. It *seems* they may be able to use this to have this handle both ends of the workload, without the translation layer.
Is that something supported, or even feasible?
Thanks!
phonon: This looks great! Do you use cached calculation chains for performance optimizations? Do you take volatile functions into account?
https://learn.microsoft.com/en-us/office/vba/excel/concepts/excel-performance/excel-improving-calculation-performance#full-calculation-and-recalculation-dependencies
jvanderbot: Props on Rust not being mentioned here or on the index page.