Difference between revisions of "PostgreSQL Adapter Project - Resources"

From CDOT Wiki

Revision as of 15:36, 13 December 2010

Postgre Project - Resources

Java

JBoss application server
example Salutation
example Dog

Database & Persistence

Database

Agile database practice [1]

  • Terminology
Collation @ msdn.microsoft.com
Cursors: Rather than executing a whole query at once, it is possible to set up a cursor that encapsulates the query, and then read the query result a few rows at a time. One reason for doing this is to avoid memory overrun when the result contains a large number of rows. [2]
Relational database Management System[3]
Indexes & Hints
Index
Hints - Hints are options or strategies specified for enforcement by the SQL Server query processor on SELECT, INSERT, UPDATE, or DELETE statements. The hints override any execution plan the query optimizer might select for a query.
Query Hints - http://msdn.microsoft.com/en-us/library/ms181714.aspx
Index vs Index Hint http://www.doens.be/2010/10/index-vs-index-hint/
Trigger - A trigger is a specification that the database should automatically execute a particular function whenever a certain type of operation is performed. Triggers can be defined to execute either before or after any INSERT, UPDATE, or DELETE operation, either once per modified row, or once per SQL statement. If a trigger event occurs, the trigger's function is called at the appropriate time to handle the event.[4]
Wiki
Oracle Server Manual
MySQL - Syntax [5]
PostgreSQL- Syntax [6]
Stored Procedures
Oracle Server Manual
Transactions
Oracle Server Manual
Mapping Objects to Relational Databases @ agiledata.org
Converting Charsets [7]
ACID
In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction. For example, a transfer of funds from one bank account to another, even though that might involve multiple changes (such as debiting one account and crediting another), is a single transaction. [8]
MVCC
Multiversion concurrency control (abbreviated MCC or MVCC), in the database field of computer science, is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory.[9]
Binary vs Integer for primary key
[10]

Persistence

  1. Eclipse Tutorial Videos on Persistence
  2. XML/Object Binding and Object Persistence = XStream + XBird (XML database system)
  3. Data persistence with JDO
  4. JavaWorld: JDO

JDBC

  1. JDBC on Java Tutorials
  2. JDBC @ http://en.wikipedia.org/wiki/JDBC
  3. Relational database Management System @ http://en.wikipedia.org/wiki/Relational_database_management_system
  4. JDBC Driver @ http://en.wikipedia.org/wiki/JDBC_driver
  5. Drivers table http://devapp.sun.com/product/jdbc/drivers
  6. PostgreSQL JDBC Documentation
To make sure that the Driver class passes through the class loader, you can do a lookup by class name, as shown in the Java code snippet in this example.

try {
    Class.forName("org.postgresql.Driver");
} catch (ClassNotFoundException cnfe) {
    System.err.println("Couldn't find driver class:");
    cnfe.printStackTrace();
}

Class.forName is a method that finds a class by name. In this case, you look for the Driver. This causes the class loader to search through the CLASSPATH and find a class by that name. If it finds it, the class loader will then read in the binary description of the class. If it does not find it, it will throw a ClassNotFoundException, in which case you can print out an error message to that effect. If you reach this state, you either haven't built the driver correctly, or the .jar file is not in your classpath.
Once you have registered the Driver class, you need to request a connection to a PostgreSQL database. To do this, you use a class called DriverManager. The DriverManager class is responsible for handling JDBC URLs, finding an appropriate driver, and then using that driver to provide a connection to the database.
JDBC URLs have the following format, in three colon-delimited parts: jdbc:[drivertype]:[database]
The first part, jdbc, is a constant. It indicates that you are connecting to a JDBC data source. The second part, [drivertype], represents the kind of database you want to connect to. Use postgresql to connect to a PostgreSQL database. The third part is passed off to the driver, which finds the actual database. It takes one of the following formats:
databasename
//hostname/databasename
//hostname:portnumber/databasename
In the first case, the PostgreSQL database is running on the local machine, on the default port number. The databasename is the literal name of the database you wish to connect to. The second case is used for when you want to specify a hostname and a database. This also uses the default port number. The third case allows you to specify a port number as well. Even if you use the first type of URL, the JDBC connection will always be made via TCP/IP.
For the purposes of the examples from now on, this chapter will use the URL: jdbc:postgresql://localhost/booktown, meaning you are connecting to host localhost and database booktown. With that in mind, try to make a connection, using all you have learned so far. Example 12-2 shows a simple Java program that opens a JDBC connection to the booktown database. If you run the example yourself, be sure to replace the username and password with values that will work on your system.
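The URL forms and the DriverManager step described above can be sketched as follows. This is a minimal illustration and not the book's Example 12-2: the class name BooktownConnect, the helper jdbcUrl, and the credentials are hypothetical placeholders, and the program assumes a PostgreSQL JDBC driver is on the classpath. With a JDBC4 driver, DriverManager normally locates the driver by itself, so the Class.forName lookup shown earlier is optional there.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class BooktownConnect {
    /**
     * Builds a PostgreSQL JDBC URL in one of the three forms described above.
     * Pass host == null for the bare "databasename" form (the port is then ignored)
     * and port <= 0 to rely on the default port.
     */
    static String jdbcUrl(String host, int port, String database) {
        StringBuilder url = new StringBuilder("jdbc:postgresql:");
        if (host != null) {
            url.append("//").append(host);
            if (port > 0) {
                url.append(':').append(port);
            }
            url.append('/');
        }
        return url.append(database).toString();
    }

    public static void main(String[] args) {
        String url = jdbcUrl("localhost", -1, "booktown");
        // Placeholder credentials (assumptions): replace with values valid on your system.
        String user = "username";
        String password = "password";
        try (Connection conn = DriverManager.getConnection(url, user, password)) {
            System.out.println("Connected to " + url + ", server version "
                    + conn.getMetaData().getDatabaseProductVersion());
        } catch (SQLException e) {
            // Also reached when no driver for this URL is on the classpath.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

The try-with-resources statement closes the connection even when a later statement throws, which is easy to forget with a manual close() call.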

PostgreSQL Specifications

Connection to the driver

  • JDBC Interface
1- XA JDBC driver for PostgreSQL (needed for NexJ Express Model)
XA Data Source: XA data sources are for distributed transactions (as per the Open Group specifications); non-XA data sources are not (transactions must be single-database). [14]
Hierarchy for package org.postgresql.xa - source of the hierarchy [15]
Bugs
PostgreSQL Limited XA Support [16] - According to the Atomikos known-problems page, PostgreSQL XA support is limited in what it can do; using PostgreSQL is not recommended and not supported there. If you do choose to use it, see the forum post linked from that page for some of the problems.
PostgreSQL HeuristicMixed Exception [17] - On PostgreSQL make sure to set the max_prepared_transactions at least as high as the max_connections, or you get heuristic problems.
2- Simple JDBC driver for PostgreSQL
JDBC Driver 4 for PostgreSQL 9. This is the current version of the driver. Unless you have unusual requirements (running old applications or JVMs), this is the driver you should be using. It supports PostgreSQL 7.2 or newer and requires a 1.4 or newer JVM. It contains support for SSL and the javax.sql package. It comes in two flavors, JDBC3 and JDBC4. If you are using the 1.6 JVM, then you should use the JDBC4 version.
JDBC3 Postgresql Driver, Version 9.0-801
JDBC4 Postgresql Driver, Version 9.0-801 [18]
imp--PostgreSQL Configuration Properties @ jdbc.postgresql.org
Practice using the PostgreSQL Driver [19] - [20] -[21]
Connection Pools And DataSources @ PostgreSQL Documentation
Issues specific to PostgreSQL and JDBC [22]
One issue is that JDBC does not do any client-side SQL parsing or syntax checking. SQL statements are passed off transparently to the database, whether or not they are valid. Therefore, if the SQL is valid on one vendor's database but invalid on another vendor's database, the implementation won't know until the actual connection is made and the SQL is sent across. Sun is attempting to deal with this problem, and there may be some provisions made to correct this, either in later versions of JDBC or in a different standard. Another issue is that each vendor has additional helper classes specific to that vendor. For instance, PostgreSQL has extensions for geometric data types. Other vendors won't support these extensions; they are specific to PostgreSQL. If you use such vendor-specific classes, your program will not work with another JDBC database, despite using the JDBC "standard."[23]
One advantage of the PostgreSQL JDBC driver is that it is a "Type 4" driver. This means that it is written in Pure Java, so it can be taken anywhere, and used anywhere as long as the platform it is used on has TCP/IP capabilities, because the driver only connects via TCP/IP.[24]
  • Using the Command line
psql
working with server
psql cheatsheet
  • GUI
pgAdmin visual tour
  • Creating database [25]
@ PostgreSQL documentation - [26]

Mapping Data Types, Metadata

  • Mapping Overview
Because data types in SQL and data types in the Java programming language are not identical, there needs to be some mechanism for transferring data between an application using Java types and a database using SQL types. [27]
Mapping - Postgre [28]
  • Data Types
Bytea - Binary - hex
A binary string is a sequence of bytes. Unlike character strings, which usually contain text data, binary strings are used to hold non-traditional data such as pictures, voice, or mixed media. [29] & [30]
According to [31], PostgreSQL cannot store values of more than several thousand bytes within any data type except large objects, nor can binary data be easily entered within single quotes; instead, large objects (BLOBs) are used to store very large values and binary data. (That size limit describes old releases: since TOAST was introduced in PostgreSQL 7.1, a bytea value can be up to 1 GB.) The PostgreSQL documentation section “String Functions and Operators” [6] describes the function ENCODE(data bytea, type text), which encodes binary data to an ASCII-only representation. The supported types are: base64, hex, escape.[31]
Using BYTEA Data with Java
Binary operations and functions: ENCODE & DECODE [32]
Example [33]
CREATE TABLE mytable (testcol BYTEA);

INSERT INTO mytable (testcol)
  VALUES (DECODE('013d7d16d7ad4fefb61bd95b765c8ceb', 'hex'));

SELECT ENCODE(testcol, 'hex') FROM mytable;   -- prints: 013d7d16d7ad4fefb61bd95b765c8ceb
SELECT testcol FROM mytable;   -- prints: \x013d7d16d7ad4fefb61bd95b765c8ceb
Why use 'BINARY strings' instead of INTEGER unique identifiers [34]
Compare JDBC data types with PostgreSQL types [35]
bit & bit varying - for storing binaries [36]
BLOBs [37]
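On the Java side, the hex text used in the ENCODE/DECODE example above corresponds to a plain byte array. The sketch below shows that conversion in both directions; the class name ByteaHex and its two methods are hypothetical helpers written for this page, not part of the PostgreSQL driver. A byte array produced this way is the kind of value that can be bound to a BYTEA column with PreparedStatement.setBytes.

```java
final class ByteaHex {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    /** Client-side counterpart of ENCODE(data, 'hex'): bytes to lowercase hex text. */
    static String encodeHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            // High nibble first, then low nibble; the masks discard sign extension.
            sb.append(DIGITS[(b >> 4) & 0x0f]).append(DIGITS[b & 0x0f]);
        }
        return sb.toString();
    }

    /** Client-side counterpart of DECODE(text, 'hex'): hex text to bytes. */
    static byte[] decodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] id = decodeHex("013d7d16d7ad4fefb61bd95b765c8ceb");
        System.out.println(id.length + " bytes");
        System.out.println(encodeHex(id));
    }
}
```

The 32 hex digits of the example value decode to 16 bytes, the same width as the BINARY(16) column used in the MySQL example further down.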
Serial - Auto_Increment - Sequence
A sequence is a special kind of database object designed for generating unique numeric identifiers. It is typically used to generate artificial primary keys. Sequences are similar, but not identical, to the AUTO_INCREMENT concept in MySQL. Sequences are most commonly used via the serial pseudotype.[38] - [39]
Adding sequence [40]

Functions

  • Functions & Operators
String literals and string functions in PostgreSQL @ PostgreSQL Documentation
Pattern Matching [http://pgsqld.active-venture.com/functions-matching.html]
PostgreSQL gotchas
fetch size, cursor,...
  • Localization
Character set support -
The character set support in PostgreSQL allows you to store text in a variety of character sets (also called encodings), including single-byte character sets such as the ISO 8859 series and multiple-byte character sets such as EUC (Extended Unix Code), UTF-8, and Mule internal code. All supported character sets can be used transparently by clients, but a few are not supported for use within the server (that is, as a server-side encoding). The default character set is selected while initializing your PostgreSQL database cluster using initdb. It can be overridden when you create a database, so you can have multiple databases each with a different character set. An important restriction, however, is that each database's character set must be compatible with the database's LC_CTYPE (character classification) and LC_COLLATE (string sort order) locale settings. For C or POSIX locale, any character set is allowed, but for other locales there is only one character set that will work correctly. (On Windows, however, UTF-8 encoding can be used with any locale.) Note! Not all client APIs support all the listed character sets. For example, the PostgreSQL JDBC driver does not support MULE_INTERNAL, LATIN6, LATIN8, and LATIN10. [41]
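The difference between single-byte and multiple-byte character sets mentioned above can be seen from the client side with a few lines of Java. This is only an illustration of encoding widths, assuming the two charsets that every Java platform provides; the class name EncodingWidths is a hypothetical helper and nothing here talks to a database.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingWidths {
    /** Number of bytes the text occupies when encoded with the given character set. */
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e, written as an escape so the source stays ASCII.
        String text = "caf\u00e9";
        // ISO 8859-1 is a single-byte character set: one byte per character.
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // prints 4
        // UTF-8 is a multiple-byte character set: the accented letter needs two bytes.
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // prints 5
    }
}
```

The same four-character string therefore has a different stored size depending on the database encoding, which matters when sizing byte-limited columns or buffers.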

MySQL Specifications

[43]
The BINARY and VARBINARY types are similar to CHAR and VARCHAR, except that they contain binary strings rather than non-binary strings. That is, they contain byte strings rather than character strings. This means that they have no character set, and sorting and comparison are based on the numeric values of the bytes in the values. (From MySQL Manual)
In MySQL SQL syntax the function HEX() can be used to get the hexadecimal value of one field of any data-type. [44]
Example
CREATE TABLE mytable (testcol BINARY(16));

INSERT INTO mytable (testcol)
  VALUES (0x013d7d16d7ad4fefb61bd95b765c8ceb);

SELECT hex(testcol) FROM mytable;   -- prints: 013D7D16D7AD4FEFB61BD95B765C8CEB
  • SQL mode
If strict SQL mode is not enabled and you assign a value to a BINARY or VARBINARY column that exceeds the column's maximum length, the value is truncated to fit and a warning is generated.
ANSI_QUOTES
Treat “"” as an identifier quote character (like the “`” quote character) and not as a string quote character. You can still use “`” to quote identifiers with this mode enabled. With ANSI_QUOTES enabled, you cannot use double quotation marks to quote literal strings, because they are interpreted as identifiers.
  • Storage Engine:
PostgreSQL is a unified database server with a single storage engine. MySQL has two layers, an upper SQL layer and a set of storage engines. [45]
InnoDB is a fully ACID transactional storage engine using MVCC (Multi-Version Concurrency Control) technology. It's the normal choice for most modern applications using MySQL. [46]
In some distributions, the default storage engine is MyISAM, which is not transaction safe. Setting the default engine to a transactional engine such as InnoDB is, however, trivial.
MySQL has a query cache that does simple string matching before the parser to see whether a query has been processed recently and rapidly returns the result to the client application if it has, without the need to do any of the traditional database work. This is of considerable value to many read-mostly workloads. Cached queries are removed whenever any table involved in the query is changed, so its usefulness declines as the rate of data changes increases. The query cache runs on a single thread and must consider each select, so it may become a performance bottleneck at some point beyond 8 cores, but that's not usually the case. It can be turned off easily to check this and to see whether its small overhead is worthwhile for the particular workload.
MySQL also supports network protocol-level compression, an option that the client can turn on if the server allows it. This compresses everything to and from the server.
Because MySQL uses C escape syntax in strings (for example, “\n” to represent a newline character), you must double any “\” that you use in LIKE strings. For example, to search for “\n”, specify it as “\\n”. To search for “\”, specify it as “\\\\”; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.
Exception: At the end of the pattern string, backslash can be specified as “\\”. At the end of the string, backslash stands for itself because there is nothing following to escape. Suppose that a table contains the following values:

mysql> SELECT filename FROM t1;

filename |
C: |
C:\ |
C:\Programs |
C:\Programs\ |

To test for values that end with backslash, you can match the values using either of the following patterns:

mysql> SELECT filename, filename LIKE '%\\' FROM t1;

filename | filename LIKE '%\\' |
C: | 0 |
C:\ | 1 |
C:\Programs | 0 |
C:\Programs\ | 1 |

mysql> SELECT filename, filename LIKE '%\\\\' FROM t1;
(This returns the same result as above.)
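The two rounds of backslash stripping described above can be mirrored by two small escaping steps when a LIKE operand is assembled in client code. The sketch below is an illustration only: the class MySqlLikeEscaper and its methods are hypothetical helpers, and it assumes MySQL's default handling of backslashes in string literals. In real applications a PreparedStatement parameter is usually the safer way to pass the pattern, in which case only the pattern-level step applies.

```java
final class MySqlLikeEscaper {
    /** Pattern layer: prefix the LIKE metacharacters (backslash, % and _) with a backslash. */
    static String escapeForLikePattern(String literal) {
        StringBuilder sb = new StringBuilder();
        for (char c : literal.toCharArray()) {
            if (c == '\\' || c == '%' || c == '_') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** String-literal layer: double each backslash and each single quote. */
    static String escapeForStringLiteral(String s) {
        return s.replace("\\", "\\\\").replace("'", "''");
    }

    /** Builds a quoted LIKE operand that matches values ending with the given text. */
    static String endsWithOperand(String literal) {
        return "'%" + escapeForStringLiteral(escapeForLikePattern(literal)) + "'";
    }

    public static void main(String[] args) {
        // One backslash in the searched text becomes four backslashes in the SQL text.
        System.out.println("SELECT filename FROM t1 WHERE filename LIKE " + endsWithOperand("\\"));
    }
}
```

For a single backslash this produces the four-backslash form of the pattern; the shorter two-backslash form is the end-of-pattern exception noted above and is not generated by this sketch.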

  • At the time this comparison was written, the current versions were MySQL 5.1 and PostgreSQL 8.4.

PostgreSQL - MySQL Compare

  1. MySQL began development with a focus on speed while PostgreSQL began development with a focus on features and standards. Thus, MySQL was often regarded as the faster of the two. [48]
  2. Both PostgreSQL and MySQL support Not-Null, Unique, Primary Key and Foreign Key constraints. However, MySQL silently ignores the CHECK constraint.
  3. MySQL supports stored procedures, per se; PostgreSQL supports stored functions, which are in practice very similar.
  4. Trigger - Both PostgreSQL and MySQL support triggers. A PostgreSQL trigger can execute any user-defined function from any of its procedural languages, not just PL/pgSQL. MySQL triggers are activated by SQL statements only. They are not activated by changes in tables made by APIs that do not transmit SQL statements to the MySQL Server; in particular, they are not activated by updates made using the NDB API. PostgreSQL has always been strict about making sure data is valid before allowing it into the database, and there is no way for a client to bypass those checks.
  5. DataTypes - PostgreSQL does not have an unsigned integer data type, but it has much richer data type support in several aspects: standards compliance, the logically fundamental data type BOOLEAN, a user-defined data type mechanism, and built-in and contributed data types. PostgreSQL allows columns of a table to be defined as variable-length multidimensional arrays. Arrays of any built-in or user-defined base type, enum type, or composite type can be created. Arrays of domains are not yet supported. MySQL does not have the network IP address data types that PostgreSQL has, but it does provide INET_ATON() and INET_NTOA() functions to convert IPv4 addresses to and from easily stored integers. Postgres does a very good job of supporting referential integrity, and has transactions and rollbacks, and foreign keys with ON DELETE CASCADE and ON UPDATE CASCADE.
  6. Security - MySQL has exceptionally good fine-grained access control. You can GRANT and REVOKE whatever rights you want, based on user name, table name and client host name.
  7. Alter table - Postgres supports ALTER TABLE to some extent. You can ADD COLUMN, RENAME COLUMN and RENAME TABLE. MySQL has all options in ALTER TABLE - you can ADD column, DROP it, RENAME or CHANGE its type on the fly - very good feature for busy servers, when you don't want to lock the entire database to dump it, change definition and reload it back.
  8. Diagnostic Log - By default, PostgreSQL logs to stderr, meaning that it's highly installation specific where the diagnostic information is put; on this author's system, the default ends up in /var/lib/pgsql/pgstartup.log. The default can be set to something more reasonable (such as syslog on Unix, eventlog on Windows) by adjusting the log_destination configuration parameter.
  9. Architecture
PostgreSQL is a unified database server with a single storage engine. MySQL has two layers, an upper SQL layer and a set of storage engines. When comparing the two it's typically necessary to specify which storage engines are being used with MySQL because that greatly affects suitability, performance and sometimes feature availability. The most commonly used storage engines in MySQL are InnoDB for full ACID support and high performance on large workloads with lots of concurrency and MyISAM for lower concurrency workloads or higher concurrency read-mostly workloads that don't need ACID properties. Applications can combine multiple storage engines as required to exploit the advantages of each. [49]
  10. Automatic key generation -
PostgreSQL doesn't support the standard's IDENTITY attribute. PostgreSQL's best offering for a column with auto-generated values is to declare a column of 'type' SERIAL:
CREATE TABLE tablename (
  tablename_id SERIAL,
  ...
)
MySQL doesn't support the standard's IDENTITY attribute. As an alternative, an integer column may be assigned the non-standard AUTO_INCREMENT attribute:
CREATE TABLE tablename (
  tablename_id INTEGER AUTO_INCREMENT PRIMARY KEY,
  ...
)
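Item 5 of the comparison above mentions that MySQL stores IPv4 addresses as integers through INET_ATON() and INET_NTOA(). The arithmetic behind that conversion can be sketched in Java as below. The class Ipv4Numbers and its methods are hypothetical helpers, they handle only full four-part dotted-quad addresses, and a long is used because Java has no unsigned 32-bit integer type.

```java
final class Ipv4Numbers {
    /** Same arithmetic as INET_ATON() for a.b.c.d: a*256^3 + b*256^2 + c*256 + d. */
    static long toNumber(String dottedQuad) {
        String[] parts = dottedQuad.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four octets: " + dottedQuad);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            // Shift the octets collected so far and append the next one.
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Inverse conversion, in the manner of INET_NTOA(). */
    static String toDottedQuad(long value) {
        if (value < 0 || value > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("not a 32-bit unsigned value: " + value);
        }
        return ((value >> 24) & 0xff) + "." + ((value >> 16) & 0xff) + "."
                + ((value >> 8) & 0xff) + "." + (value & 0xff);
    }

    public static void main(String[] args) {
        long n = toNumber("10.0.5.9");
        System.out.println(n);               // prints 167773449
        System.out.println(toDottedQuad(n)); // prints 10.0.5.9
    }
}
```

When porting a schema from MySQL to PostgreSQL, a column holding such integers could either stay numeric or be converted to one of PostgreSQL's network address types; which is better depends on how the application queries it.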

Links

Compare architecture @ Wiki
Compare SQL Implementations [50]
Compare Postgre and MySQL [51]
Comparison based on Postgre website [52]
Compare Data types [53]
Compare MySQL, Oracle, PostgreSQL [http://www-css.fnal.gov/dsg/external/freeware/mysql-vs-pgsql.html]
Converting MySQL to PostgreSQL -> data types & Functions & Command-line [54]