[Hacker News repost] JSON Patch
Title: JSON Patch
Text:
Url: https://zuplo.com/blog/2024/10/10/unlocking-the-power-of-json-patch
## Post by: DataOverload

### Comments:

**skrebbel**: I quite like JSON Patch but I've always felt that it's so convoluted only because of its goal of being able to modify every possible JSON document under the sun. If you allow yourself to restrict your data set slightly, you can patch documents much simpler.

For example, Firebase doesn't let you store null values. Instead, for Firebase, setting something to null means the same as deleting it. With a single simple restriction like that, you can implement PATCH simply by accepting a (recursive) partial object of whatever that endpoint. Eg if /books/1 has

    { title: "Dune", score: 9 }

you can add a PATCH /books/1 that takes eg

    { score: null, author: "Frank Herbert" }

and the result will be

    { title: "Dune", author: "Frank Herbert" }

This is way simpler than JSON Patch - there's nothing new to learn, except "null means delete". IMO "nothing new to learn" is a fantastic feature for an API to have.

Of course, if you can't reserve a magic value to mean "delete" then you can't do this. Also, appending things to arrays etc can't be done elegantly (but partially mutating arrays in PATCH is, I'd wager, often bad API design anyway). But it solves a very large % of the use cases JSON Patch is designed for in a, in my humble opinion, much more elegant way.
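The "null means delete" rule skrebbel describes can be sketched in a few lines of Python. This is a minimal illustration, not Firebase's implementation; the function name `merge_patch` is an assumption made for this example. The same idea is standardized as JSON Merge Patch (RFC 7396), and the sketch follows that algorithm as far as I can tell.

```python
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply a recursive partial-object patch where null (None) means delete."""
    # A patch that is not an object replaces the target wholesale.
    if not isinstance(patch, dict):
        return patch
    # Copy the target so the caller's document is left untouched;
    # a non-object target is treated as an empty object.
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "delete this key"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


book = {"title": "Dune", "score": 9}
patched = merge_patch(book, {"score": None, "author": "Frank Herbert"})
print(patched)  # {'title': 'Dune', 'author': 'Frank Herbert'}
```

The limitations mentioned in the comment show up directly in the code: a stored `None` can never be expressed, and a list in the patch replaces the whole list rather than appending to it.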
**bsimpson**: `/` is a weird choice of delimiter for JSON.

Since JSON is a subset of JS, I would have expected `.` to be the delimiter. That jives with how people think of JSON structures in code. (Python does require bracket syntax for traversing JSON, but even pandas uses dots when you generate a dataframe from JSON.)

When I see `/`, I think:

- "This spec must have been written by backend people," and
- "I wonder if there's some relative/absolute path ambiguity they're trying to solve by making all the paths URLs."

**owobeid**: I've only used JSON Patch once, as a quick hack to fix a problem I never thought I would encounter.

I had built a quick and dirty web interface so that a handful of people we contracted overseas could annotate some text data at the word level.

Originally, the plan was that the data would be annotated in small chunks (a sentence or two of text), but apparently the person managing the annotation team started assigning whole documents, and we got a complaint that suddenly the annotations weren't being saved.

It turned out that the annotators had been using a dial-up connection the entire time (in 2018!) and so the upload was timing out for them.

We panicked a bit until I discovered JSON Patch and I rewrote the upload code to only use the patch.

**zarzavat**: Why is the path a string and not an array? That means you have to have some way to escape / in keys, and also you need to parse the path string. Parser in parser syndrome. Or otherwise it can't handle arbitrary JSON documents.

**toomim**: JSON Patch doesn't let you represent insertions or deletions to strings. You can only replace them. This makes it useless for collaborative text editing. Thus, we can't use it in Braid-HTTP: https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http
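On the `/` path strings that bsimpson and zarzavat discuss: JSON Patch paths are JSON Pointers (RFC 6901), which escape `/` inside a key as `~1` and `~` as `~0`. The sketch below shows the parsing step zarzavat objects to, plus a deliberately small subset of patch application (`add`, `remove`, `replace` only). The helper names `parse_pointer` and `apply_patch` are assumptions for this example, not any library's API, and the error handling is far looser than the specs require.

```python
import copy
from typing import Any, Dict, List


def parse_pointer(pointer: str) -> List[str]:
    """Split a JSON Pointer string into unescaped reference tokens."""
    if pointer == "":
        return []  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"pointer must be empty or start with '/': {pointer!r}")
    # Decode "~1" before "~0" so that "~01" becomes "~1" rather than "/".
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer[1:].split("/")]


def apply_patch(doc: Any, ops: List[Dict[str, Any]]) -> Any:
    """Apply add/remove/replace operations to a copy of doc (sketch only)."""
    doc = copy.deepcopy(doc)
    for op in ops:
        tokens = parse_pointer(op["path"])
        if not tokens:
            raise ValueError("this sketch does not patch the document root")
        # Walk down to the container that holds the final token.
        parent = doc
        for token in tokens[:-1]:
            parent = parent[int(token)] if isinstance(parent, list) else parent[token]
        last, kind = tokens[-1], op["op"]
        if kind not in ("add", "remove", "replace"):
            raise ValueError(f"unsupported op: {kind}")
        if isinstance(parent, list):
            if kind == "add":
                # "-" means "append after the last element".
                parent.insert(len(parent) if last == "-" else int(last), op["value"])
            elif kind == "remove":
                del parent[int(last)]
            else:
                parent[int(last)] = op["value"]
        else:
            if kind == "add":
                parent[last] = op["value"]
            elif kind == "remove":
                del parent[last]
            else:
                if last not in parent:
                    raise KeyError(last)  # replace requires an existing member
                parent[last] = op["value"]
    return doc


doc = {"a/b": {"tags": ["x"]}, "title": "Dune", "score": 9}
patched = apply_patch(doc, [
    {"op": "add", "path": "/a~1b/tags/-", "value": "y"},
    {"op": "replace", "path": "/title", "value": "Dune Messiah"},
    {"op": "remove", "path": "/score"},
])
print(patched)
```

The `replace` on `/title` swaps the entire string, which is the limitation toomim points out: there is no operation that inserts or deletes characters inside a string value. The operation list is also much smaller than the document it modifies, which is the property owobeid relied on for slow connections.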