The End of SQL Databases?
Published: 2019-06-23


Part 1

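The development of SQL began with a paper by Dr. E. F. Codd published in June 1970 in Communications of the ACM. At the time, his IBM colleagues Donald Chamberlin and Raymond Boyce were working on a query language (originally called SQUARE, short for Specifying Queries As Relational Expressions), work that culminated in a 1974 paper. Since then, SQL has been the dominant language of relational database systems. In recent years the software industry has produced frameworks and architectures whose main purpose is to hide (or abandon entirely) the direct use of SQL and relational databases, so that developers can concentrate on the user interface, business logic, and platform support. At the same time, a group of databases regarded as alternatives to relational databases has appeared under the name “NoSQL”. Are we about to witness the end of SQL and relational databases?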

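In a hosted discussion, I was asked: “With the popularity of ORMs (Object Relational Mapping), some software developers believe SQL has lost its value. What do you think of this view?” I thought about the question throughout the New Year holiday, considering what it implies and where ORM is headed, and I spent some time studying several ORM frameworks. These frameworks still require developers to know how to design, develop, and maintain relational data. Microsoft's LINQ (.NET Language Integrated Query) likewise only reduces the mismatch between the programming language and the database language.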
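To make the point about ORMs concrete, here is a minimal sketch of what such a framework does under the hood: it maps an object to a row and still issues SQL against a relational schema. The `Customer` class, the `CustomerMapper` helper, and the `customer` table are hypothetical names invented for this illustration and are not taken from any particular ORM.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    id: int
    name: str


class CustomerMapper:
    """Hypothetical hand-rolled mapper; a real ORM generates similar SQL."""

    def __init__(self, conn):
        self.conn = conn
        # The relational schema still has to be designed by someone.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def save(self, customer):
        # Object -> row: the mapping hides the SQL but does not remove it.
        self.conn.execute(
            "INSERT OR REPLACE INTO customer (id, name) VALUES (?, ?)",
            (customer.id, customer.name),
        )

    def find(self, customer_id):
        # Row -> object, or None when no row matches.
        row = self.conn.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None


conn = sqlite3.connect(":memory:")
mapper = CustomerMapper(conn)
mapper.save(Customer(1, "Ada"))
found = mapper.find(1)
```

The application code only touches `Customer` objects, yet the table definition, keys, and queries are all still there, which is why relational design knowledge remains necessary.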

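The “NoSQL movement” aims to sever completely the dependence of developers on the SQL language and relational databases. Some programmers regard the NoSQL movement as a brand-new concept. Yet the idea first appeared in the 1980s, and it was first commercialized for document data storage in the 1990s.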

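Charlie Caro, a senior software engineer who works on engine development at Embarcadero in the United States, told me: “In the past, it was widely assumed that a database that did not control concurrent operations on its data could hardly gain broad acceptance. But Ozzie recognized that the benefits of distribution, replication, and ease of installation far outweighed the benefit of controlling concurrent update conflicts, which are rarely encountered when managing documents and messages. And if document data must be modified correctly without losing data, the configuration can be switched to concurrency control as an option. By default, however, update conflicts are not considered.”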
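The trade-off in that quote can be sketched in a few lines. The `DocumentStore` class below is a hypothetical in-memory store, not the API of any product mentioned here; it assumes each document carries a version number, and that conflict checking is an opt-in flag that is off by default.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale version of a document."""


class DocumentStore:
    def __init__(self, check_conflicts=False):
        # Conflict checking is optional and off by default (an assumption
        # of this sketch, mirroring the default described in the quote).
        self.check_conflicts = check_conflicts
        self._docs = {}  # doc_id -> (version, body)

    def read(self, doc_id):
        return self._docs.get(doc_id, (0, None))

    def write(self, doc_id, body, based_on_version):
        current_version, _ = self._docs.get(doc_id, (0, None))
        if self.check_conflicts and based_on_version != current_version:
            raise ConflictError(f"{doc_id}: stale version {based_on_version}")
        self._docs[doc_id] = (current_version + 1, body)
        return current_version + 1


def simulate(store):
    """Two editors read the same version, then both try to write."""
    store.write("memo", "draft", based_on_version=0)
    version, _ = store.read("memo")
    store.write("memo", "edit by A", based_on_version=version)
    try:
        store.write("memo", "edit by B", based_on_version=version)
    except ConflictError:
        return "conflict detected", store.read("memo")[1]
    return "silently overwritten", store.read("memo")[1]


relaxed = simulate(DocumentStore())
strict = simulate(DocumentStore(check_conflicts=True))
```

With the default store, editor B silently overwrites editor A's change; with checking enabled, the stale write is rejected and A's edit survives.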

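NoSQL, by one definition, is “an umbrella term for a loosely defined class of non-relational data stores.” The term was coined in a blog post in which the word NoSQL (now generally read as Not Only SQL) appeared. The real highlight of that post is: “the whole point of seeking alternatives is that you need to solve a problem that relational databases are a bad fit for.” Adam Keys offered a similar term on his blog: “Post-Relational”. Some NoSQL databases also aim to eliminate the heavy demands that relational databases place on computing resources and memory. Other NoSQL goals include looser coupling to programming languages, access through web technologies and RPC calls, and optional forms of data query.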

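In a recent blog post, Professor Michael Stonebraker compared SQL and NoSQL databases. (Note: more characteristics could be added to the list below. Feel free to add, in the comments, any characteristics that you think distinguish the two kinds of database):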

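  • Scale-out versus scale-up – Relational (traditional) databases are usually deployed on a single server and upgraded by adding processors, memory, and disks. Relational databases deployed across multiple servers usually rely on replication to keep their data synchronized. NoSQL databases can run on a single server, but more often they are deployed as a distributed cloud.
  • Columns, key/value stores, and tuple stores – Relational databases are usually built from the columns of tables and views (a fixed schema, joined by various operations). NoSQL databases usually store key/value pairs or tuples (no fixed schema, just an ordered sequence of data).
  • Memory and disk usage – Relational databases usually reside on a disk drive or in a network storage area. SQL queries and stored procedures pull data sets into memory. Some (but not all) NoSQL databases can operate directly on disk, and can also use memory to speed things up.
  • Document-Oriented, Collection-Oriented, Column-Oriented, Object-Oriented, Set-Oriented, Row-Oriented – Document-oriented databases store documents, properties, and XML. Collection-oriented data sets offer features better suited to object-oriented programming languages. Relational databases are characterized by tables, rows, and columns. A SQL query usually returns a cursor over the row or set of rows that contains the specified columns. Object-oriented databases emerged because of the popularity of object-oriented programming, but so far (and for many years to come) relational databases remain the dominant data storage model. The rise of object relational mapping (ORM) frameworks has bound object-oriented programming tightly to most relational databases. Data in NoSQL databases is usually stored as objects, key/value pairs, or tuples. Queries against NoSQL databases are usually carried out in program code or through an interface.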
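The contrast drawn in the list above can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module to stand in for a relational database and a plain dictionary to stand in for a key/value store; the `profile` table, the `user:N` key format, and the sample data are assumptions made up for this example.

```python
import sqlite3

# Relational side: a fixed schema, queried declaratively with SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profile (user_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO profile VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Linus", "Helsinki")],
)
london_sql = conn.execute(
    "SELECT name FROM profile WHERE city = ?", ("London",)
).fetchall()

# Key/value side: no fixed schema, values may differ in shape per key.
kv = {
    "user:1": {"name": "Ada", "city": "London"},
    "user:2": {"name": "Linus", "languages": ["C"]},  # no "city" field at all
}
name_by_key = kv["user:1"]["name"]  # lookup by key is direct

# Anything other than a key lookup is done in program code.
london_kv = sorted(v["name"] for v in kv.values() if v.get("city") == "London")
```

The relational query is expressed in SQL against named columns, while the key/value query is ordinary application code that has to tolerate records with missing fields.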

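In an email exchange, Charlie Caro told me the following: “If Facebook needs to manage 100,000,000 user profiles, a distributed, context-free, key-value store is the best fit. Lookups across that many users pose no problem, but a single user's update could overload a traditional database. One user updating while many users are reading requires concurrency control. In most cases, NoSQL solutions attract their users because they are easy to install and use. SQL databases require more setup (schemas and so on), but it is exactly those schemas that give parallel relational database systems their high performance. The benefit of ease of use shows up mostly during development. Many of today's programmers prefer scripting languages to compiled languages that offer the same functionality with safer static type checking. Scripting languages are simply more forgiving and easier to pick up, and some software can compile those scripts to .NET/Java bytecode to improve runtime performance.” He and I agree that all of this comes down to having the right tool for the job, as it always has. Who would drive a screw with a hammer when a screwdriver is at hand?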
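As a rough sketch of the distributed key-value idea in that quote, the following spreads profile keys across several nodes by hashing the key. `PartitionedStore`, the `profile:N` key format, and the use of simple modulo placement are assumptions of this illustration; production systems typically use more elaborate schemes such as consistent hashing and replication.

```python
import hashlib


class PartitionedStore:
    """Hypothetical key-value store that spreads keys over several nodes."""

    def __init__(self, node_count):
        # Each "node" is just a dictionary here; in practice it is a server.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # A stable hash of the key decides which node owns it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)


store = PartitionedStore(node_count=4)
for user_id in range(1000):
    store.put(f"profile:{user_id}", {"id": user_id})

sizes = [len(node) for node in store.nodes]  # how many profiles each node holds
```

Every read or write touches exactly one node, so capacity grows by adding nodes rather than by enlarging a single server.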

Part 2

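You would not believe how many open source and closed source NoSQL database products exist today, and new ones appear every day. If my list omits your favorite NoSQL database, please leave a comment to tell me. Below are NoSQL database products of many different types: document-oriented, collection-oriented, column-oriented, object-oriented, graph-oriented, set-oriented, row-oriented, and so on.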

Company/Organization: Franz Inc.
Type: Graph
Description: Modern, high performance, persistent graph database.
Storage: Disk based, meta-data and data triples.
API(s): SPARQL, Prolog

Company/Organization: Oracle
Type: Key/Value
Description: C language embeddable library for enterprise-grade, concurrent, transactional storage services. Thread safe to avoid data corruption or loss.
Storage: B-tree, hash table, persistent queue
API(s): C, C++ and Java
Notes: Use the Berkeley DB XML layer on top of Berkeley DB for XML-based applications.

Company/Organization: Google
Type: Sparse, distributed, persistent multidimensional sorted map.
Description: Distributed storage system for structured data. Data model provides dynamic control over data layout and format. Data can live in memory or on disk.
Storage: Data is stored as an uninterpreted array of bytes. Client applications can create structured and semi-structured data inside the byte arrays.
API(s): Python, GQL, Sawzall API, REST, various.
Notes: Overview available in PDF format.

Company/Organization: Apache
Type: Dimensional hash table
Description: Highly scalable distributed database that combines a distributed design with a column family data model.
Storage: Clusters of multiple keyspaces. The keyspace is a name space for column families. Columns are comprised of a name, value and timestamp.
API(s): Java, Ruby, Perl, Python, C#, Thrift framework.
Notes: Open sourced by Facebook in 2008.
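The storage layout in the entry above (keyspaces that hold column families, with each column made of a name, a value and a timestamp) can be modeled as nested maps. This is only a sketch in plain Python, not the product's API: `ColumnFamilyStore` and its methods are hypothetical, and the rule that the write with the newest timestamp wins is an assumption of this example.

```python
from collections import defaultdict


class ColumnFamilyStore:
    def __init__(self):
        # keyspace -> column family -> row key -> column name -> (value, timestamp)
        self._data = defaultdict(
            lambda: defaultdict(lambda: defaultdict(dict))
        )

    def insert(self, keyspace, family, row_key, name, value, timestamp):
        columns = self._data[keyspace][family][row_key]
        existing = columns.get(name)
        # Assumed conflict rule: keep whichever write has the newest timestamp.
        if existing is None or timestamp >= existing[1]:
            columns[name] = (value, timestamp)

    def get(self, keyspace, family, row_key, name):
        column = self._data[keyspace][family][row_key].get(name)
        return None if column is None else column[0]


store = ColumnFamilyStore()
store.insert("app", "users", "u1", "email", "old@example.com", timestamp=1)
store.insert("app", "users", "u1", "email", "new@example.com", timestamp=2)
store.insert("app", "users", "u1", "email", "stale@example.com", timestamp=1)
email = store.get("app", "users", "u1", "email")
```

Because every column carries its own timestamp, a late-arriving older write can be recognized and discarded without any locking.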

Company/Organization: Apache
Type: Document
Description: Distributed database with incremental replication, bi-directional conflict detection and management.
Storage: Ad-hoc and schema-free with a flat address space.
API(s): RESTful JSON API. JavaScript query language.
Notes: CouchDB

Company/Organization: Versant
Type: Object
Description: Java and .NET dual license (commercial and open source) object database.
Storage: Data objects are stored in the way they are defined in the application.
API(s): Java, .NET languages.
Notes: db4o

Company/Organization: Millstone Creative Works
Type: JSON-based
Description: Schemaless database similar to Amazon’s SimpleDB. Open source, standalone Java application server.
Storage: JSON data format, “bags” (similar to tables).
API(s): HTTP and JavaScript APIs
Notes: DovetailDB reference manual

Company/Organization: Cliff Moon
Type: Key/Value
Description: Open source  clone written in Erlang.
Storage: Distributed key/value store, pluggable storage engines.
API(s): Thrift API
Notes: Dynomite

Company/Organization: IBM
Type: In-memory grid/cache
Description: Distributed cache that processes, partitions, replicates and manages data across servers.
Storage: Data and database cache, “near cache” for local subset of data. Java persistent cache. Map reduce support.
API(s): Java APIs, REST data service
Notes: eXtreme Scale web site

Company/Organization: FIS
Type: Hierarchical, multi-dimensional sparse arrays, content associative memory
Description: Small footprint, multi-dimensional array with full support for ACID transactions, optimistic concurrency and software transactional memory.
Storage: Unstructured array of bytes. Can be Key/Value, document oriented, schema-less, dictionary or any other data model.
API(s): Mumps, C/C++, SQL
Notes: GT.M

Company/Organization: Christoph Rupp
Type: Embedded storage library
Description: Lightweight embedded database engine. Supports on-disk and in-memory databases.
Storage: B+tree with variable length keys.
API(s): C++, Python, .NET and Java
Notes: hamsterdb

Company/Organization: Apache
Type: Sparse, distributed, persistent multidimensional sorted map.
Description: Open source, distributed, column-oriented, “Bigtable like” store
Storage: Data row has a sortable row key and an arbitrary number of columns, each containing arrays of bytes.
API(s): Java API, Thrift API, RESTful API
Notes: HBase, part of an Apache project.

Company/Organization: Zvents Inc.
Type: Sparse, distributed, persistent multidimensional sorted map.
Description: High performance distributed data storage system designed to run on distributed filesystems (but can run on local filesystems). Modeled after Google Bigtable.
Storage: Row key (primary key), column family, column qualifier, time stamp.
API(s): C++, Thrift API, HQL
Notes: Hypertable

Company/Organization: JBoss Community
Type: Grid/Cache
Description: Scalable, highly available, peer to peer, data grid platform.
Storage: Key/Value pair with optional expiration lifespan.
API(s): Java, PHP, Python, Ruby, C
Notes: Infinispan

Company/Organization:
Type: Graph
Description: Internet graph database made up of nodes and edges. Supports in-memory and persistent storage alternatives including RDBMS, file system, file grid, and custom storage.
Storage: Nodes (MeshObjects) and edges (relationships). MeshObjects can have entity types and properties, and can participate in relationships. MeshObjects raise events.
API(s): RESTful web services.
Notes: InfoGrid

Company/Organization: Scalien
Type: Key/Value
Description: Distributed (master/slave) key-value data store delivering strong consistency, fault-tolerance and high availability.
Storage: Uses the BerkeleyDB library for local storage. Key/Value pairs and their state are replicated to multiple servers.
API(s): C/C++, Python, PHP, HTTP
Notes: Keyspace

Company/Organization:
Type: Key/Value
Description: High performance, high reliability persistent storage engine for key/value object storage.
Storage: Uses BerkeleyDB as storage library/backend.
API(s): Memcache protocol, C, Python, Java, Perl
Notes: MemcacheDB (PDF format)

Company/Organization: Ericsson
Type: Key/Value
Description: Multiuser distributed database including support for replication and dynamic reconfiguration.
Storage: Organized as a set of tables made up of Erlang records. Tables also have properties including type, location, persistence, etc.
API(s): Erlang
Notes: Mnesia

Company/Organization: 10gen
Type: Document
Description: Scalable, high-performance, open source, schema-free, document-oriented database
Storage: JSON-like data schemas, dynamic queries, indexing, replication, MapReduce
API(s): C, C++, Java, JavaScript, Perl, PHP, Python, Ruby, C#, Erlang, Go, Groovy, Haskell, Scala, F#
Notes: MongoDB

Company/Organization: Neo Technology
Type: Graph
Description: Embedded, small footprint, disk based, transactional graph database written in Java. Dual license – free and commercial.
Storage: Graph-oriented data model with nodes, relationships and properties.
API(s): Java, Python, Ruby, Scala, Groovy, PHP, RESTful API.
Notes: Neo4J
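A graph-oriented data model of nodes, relationships and properties, as described in the entry above, can be sketched with two dictionaries and a breadth-first traversal. The `Graph` class, its method names, and the sample people are hypothetical and written only for this illustration; they are not the API of any graph database listed here.

```python
from collections import deque


class Graph:
    def __init__(self):
        self.nodes = {}  # node id -> properties
        self.edges = {}  # node id -> list of (relationship type, target, properties)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties
        self.edges.setdefault(node_id, [])

    def relate(self, source, rel_type, target, **properties):
        self.edges[source].append((rel_type, target, properties))

    def traverse(self, start, rel_type, max_depth):
        """Breadth-first walk along one relationship type, up to max_depth hops."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue  # do not expand nodes at the depth limit
            for kind, target, _ in self.edges[node]:
                if kind == rel_type and target not in seen:
                    seen.add(target)
                    found.append(target)
                    queue.append((target, depth + 1))
        return found


g = Graph()
for person in ("alice", "bob", "carol", "dave"):
    g.add_node(person, name=person.title())
g.relate("alice", "KNOWS", "bob", since=2008)
g.relate("bob", "KNOWS", "carol")
g.relate("carol", "KNOWS", "dave")
g.relate("alice", "WORKS_WITH", "dave")

friends_of_friends = g.traverse("alice", "KNOWS", max_depth=2)
```

The traversal follows relationships directly from node to node, which is the kind of query that would need repeated self-joins in a relational schema.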

Company/Organization:
Type: Key/Value
Description: Key/Value store with the dataset kept in memory and saved to disk asynchronously. “not just another key-value DB”
Storage: Values can be strings, lists, sets and sorted sets.
API(s): Python, Ruby, PHP, Erlang, Lua, C, C#, Java, Scala, Perl
Notes: Redis

Company/Organization: Amazon
Type: Item/Attribute/Value
Description: Scalable Web Service providing data storage, query and indexing in Amazon’s cloud.
Storage: Items (like rows of data), Attributes (like column headers), and Values (can be multiple values)
API(s): SOAP, REST
Notes: SimpleDB

Company/Organization:
Type: Key/Value
Description: Library (written in C) of functions for managing files of key/value pairs. Multi-thread support.
Storage: Keys and Values can have variable byte length. Binary data and strings can be used as a key and a value.
API(s): C, Perl, Ruby, Java, Lua.
Notes: Tokyo Cabinet (PDF format). Also available: Tokyo Tyrant (remote service), Tokyo Dystopia (full text search), Tokyo Promenade (content management).

Company/Organization: LinkedIn
Type: Hash Table
Description: “It is basically just a big, distributed, persistent, fault-tolerant hash table.” High performance and availability.
Storage: Each key is unique to a store. Each key can have at most one value. Supported types: JSON, string, identity, protobuf, java-serialization.
API(s): Java, C++, custom clients
Notes: Project Voldemort

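Having so many non-relational databases to choose from is a good thing. Some NoSQL knowledge and hands-on experience help managers, architects, and developers weigh the known strengths and weaknesses of relational databases against those of NoSQL databases. Relational databases and the SQL query language remain central to the design, development, and management of all kinds of database applications. But when we need to start using cloud database architectures, the knowledge and material we have gathered will let us migrate quickly. Whether to keep existing relational database technology or replace it with NoSQL is a decision that depends entirely on user and business requirements.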

Part 3

If you want more information about NoSQL and non-relational databases, see the following web sites, blogs, and articles:

  • By Eric Lai, Computerworld.
  • Dynamo: by Werner Vogels, Amazon CTO, from his blog post and team article.
  • Google BigTable: Google Labs home page.
  • A generic intro to NoSQL by Ben Scofield, CodeMash, January 14, 2010.
  • By Wei Zhou, Pierre Guillaume and Chi Chi-Hung, Euro-Par 2009 conference (also available as a PDF).
  • By David Ramel, for Redmond Developer News.
  • By Adam Keys, software developer and writer, on The Real Adam blog, August 2009.
  • By Ping Li, Accel Partners.
  • From the Code Monkeyism blog.
  • By Jonathan Ellis, on the Rackspace cloud blog.
  • By Chris Williams, author of Naked JavaScript and co-curator of the NoSQL East conference, from his Voodoo Tiki God blog.
  • By Vineet Gupta, GM Software Engineering at Directi Group, on his blog.
  • Meetup groups around the world, from meetup.com.
  • A website that is “Your Ultimate Guide to the Non-Relational Universe!”
  • A Google web discussion group.

Below are some upcoming and recent NoSQL conferences where architects and developers can pick up valuable information. This is only a partial list:

  • March 11, 2010, Boston, Massachusetts. Hosted by 10gen (which provides commercial support for MongoDB).
  • May 26-27, Broomfield, Colorado.
  • June 2-3, Stockholm, Sweden.
  • September 28-30, 2010, Frankfurt/Main, Germany. Workshops: Workshop & Meetup, September 28, 2010.
  • FOSDEM
  • Meetup, November 2009. The meetup web site links to several of the papers that were presented.

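The comments and suggestions that visitors left on the Computerworld blog and elsewhere are well worth reading. Thanks to everyone who took part in the discussion of relational and non-relational databases. Here is a selection from those comments: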

  • Emil Eifrem (Neo4j) commented: “You talk about scaling to size and
    handling Facebook’s 100M user profiles. That’s an important use case and
    one that for example a key-value store handles brilliantly. But it
    turns out most companies aren’t Facebook. You can categorize the four
    emerging categories of NOSQL databases (key-value stores, column family
    stores, document dbs and graph databases) along the axes of scaling to
    size and scaling to complexity. Graph databases (like e.g. Neo4j,
    which I’m involved with)
    excels at representing complex and rapidly evolving domain models and
    then traversing them with high performance.”
  • Mongo-DB Developer commented: “We have seen the most common use case
    to date being use of nosql solutions as operational data store of web
    infrastructure projects. By operational, I mean, problems with real time
    writes and reads (contrast with data warehousing with bulk occasional
    loading). For these sort of problems these solutions work well and also
    fit well with agile development methods where the somewhat ‘schemaless’
    (or more accurately, columnless) nature of some of the solutions, and
    the dynamically typed nature of the storage, really helps.”
  • Peter R commented: “I have already seen, in the domain I work in,
    the movement away from straight up SQL databases. XML databases are one
    technology that will be stealing a lot of SQL’s thunder (if they haven’t
    already). Do I think SQL will ever die? No. But the key is that there
    will be/are more options that need to be thought about when designing a
    system now.”
  • Anonymous commented: “I agree object databases have a purpose. They
    are great for large datasets that need to be replicated and called by a
    key. However SQL provides a very important capability and that it is to
    be able to query data across a number of datasets very efficiently, this
    will be very hard to duplicate in a simple key value database.”
  • Johannes Ernst commented: “One of the difficulties for “normal”
    developers with many of the NoSQL technologies that you’ve described so
    far has been the learning curve and the additional work required: e.g.
    it’s easy and everybody knows how to put “every customer can place one
    or more orders” into a relational database, but what if the only thing
    you have is keys and opaque values? Compared to many other NoSQL
    alternatives, graph databases provide a high level of abstraction,
    freeing developers to concentrate on their application, while still
    bringing many of the same NoSQL benefits. For example, in InfoGrid, a project I’m
    involved in, you can define “Customer” and “Order” and their
    relationship, and the InfoGrid graph database takes care of storing and
    retrieving data and enforcing the relationship. In our experience, that
    makes graph databases much more approachable to developers than many
    other NoSQL technologies.”
  • Database-ed commented: “The problem is that when folks think about
    storing information that they need to retrieve, they are so ingrained to
    SQL that they fail to think of other means. The Facebook example is a
    case in point. Who is ever going to ask for an accurate report of every
    user in Facebook? If you miss something the first time you go looking,
    you can always present it later. The end user doesn’t know you lost it,
    they assume it didn’t exist at the time and now it does. Yet you still
    need to store the data for easy retrieval. One problem with SQL is that
    it ties you into the relationships. Facebook is about letting people
    build the relationships based on the fields they want to build them on,
    not the ones you might think of. I know, it can be done within the
    confines of SQL, but it is a lot harder to do when the size gets large.”
  • Another reader commented:
    “Some tasks that are poorly serviced by SQL may get switched over to a
    new method, but other implementations that are perfectly suited to SQL
    will continue using it. As they quoted Eric Evans in the article, ‘the
    whole point of seeking alternatives is that you need to solve a problem
    that relational databases are a bad fit for.’”
  • Another reader commented: “While I highly doubt there’s going to be any significant
    migration away from SQL and the like any time soon, I think more web
    developers will start experimenting with data stores and other data
    solutions as we move further into the cloud.”
  • Another reader commented:
    “And as companies turn to ask their SQL DBAs what they think of this,
    they’ll say “lets stick with SQL.” Honestly, there are so many people
    that support SQL right now that will not switch any time soon this
    article is just bogus. You can’t make a switch like that until people
    can support it properly.”
  • Another reader commented: “Document centric is pretty dumb if you plan on doing any
    sort of analytics and data mining. Great for workflow and such.”
  • Another reader commented: “The
    significance of the NoSQL movement is that it adds new tools that offer
    better solutions to specific problems. The future probably belongs to
    NoSQL in the sense of ‘not-only SQL’, rather than ‘no SQL’. Don’t
    imagine that NoSQL solutions offer a free lunch though. I had an
    educational experience when I changed a view definition in a CouchDB
    data store and my first trivial query took an hour to come back. CouchDB
    can be pleasingly fast when all its indexes are built, but if you have
    to rebuild those indexes from scratch … well, let’s just say that’s
    not something you want to do on a live client-facing site.”
  • Another reader commented: “[…] is one of the bigger proponents of
    Cassandra, a distributed data store in the vein of which the article is
    talking about.”
  • Another reader commented:
    “SQL will be around for awhile. It’s good at doing what it was designed
    to do. However, there are many times when people use SQL simply because
    there is nothing better out there. As data complexity rises, a new
    method for accessing and persisting that data will have to be
    investigated. Part of the problem with many of the alternate solutions
    is that few people know how to use them.”
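Two of the comments above contrast how easily a relational database expresses “every customer can place one or more orders” and queries across data sets, with doing the same over keys and opaque values. The sketch below shows both sides, using Python's built-in `sqlite3` module and a plain dictionary; the table names, key formats, and sample data are assumptions invented for the illustration.

```python
import sqlite3

# Relational side: the relationship lives in the schema, the join in the query.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        total REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 75.0), (12, 2, 40.0);
    """
)
sql_totals = conn.execute(
    """
    SELECT c.name, SUM(o.total)
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
    """
).fetchall()

# Key/value side: the same data, but the "join" is written by hand.
kv = {
    "customer:1": {"name": "Ada"},
    "customer:2": {"name": "Linus"},
    "order:10": {"customer": "customer:1", "total": 25.0},
    "order:11": {"customer": "customer:1", "total": 75.0},
    "order:12": {"customer": "customer:2", "total": 40.0},
}
kv_totals = {}
for key, value in kv.items():
    if key.startswith("order:"):
        name = kv[value["customer"]]["name"]  # follow the reference manually
        kv_totals[name] = kv_totals.get(name, 0.0) + value["total"]
```

Both approaches reach the same totals, but in the key/value version the relationship is only a naming convention that the application must enforce and traverse itself.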

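Years from now, I expect that most of us will still rely on relational databases and SQL. I certainly intend to keep looking for better ways to abstract and encapsulate data access. As always, any engineering decision must fit user and business requirements. For future software projects, I believe we will find a suitable non-relational data store.

Are you using a non-relational database? Have you given up SQL and relational databases? Are you moving your data to a public or private cloud database? Please leave a comment.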

Original publication date: 2012-02-21

This article comes from “Linux中国” (Linux China), a partner of the Yunqi Community.

Reposted from: http://bgkra.baihongyu.com/
