컴라면
컴 라 면
컴라면
전체 방문자
오늘
어제
  • 분류 전체보기 (8)
    • Database (4)
      • PostgreSQL (4)
    • Languages (1)
      • C++ (1)
      • Shell Script (0)
    • 자료구조 (0)
    • 운영체제 (1)
      • Linux Tips (1)
    • 알고리즘 문제풀이 (0)
      • LeetCode (0)
    • 개발 툴 (2)
      • Git (2)
      • 팁 (0)
    • 삽질 (0)
      • mysql 설치 (0)
      • 리눅스 (0)
    • 기타 (0)
      • 마크다운 (0)
    • 티스토리 팁 (0)

블로그 메뉴

  • HOME
  • TAG
  • MEDIA LOG
  • LOCATION LOG
  • GUEST BOOK
  • ADMIN
  • WRITE

공지사항

인기 글

태그

  • readahead
  • postgres
  • Git
  • correlation
  • 설치
  • tpc-h
  • C++
  • tpch
  • std 네임스페이스
  • 생활코딩
  • blockdev
  • postgresql
  • 네임스페이스
  • namespace

최근 댓글

최근 글

티스토리

hELLO · Designed By 정상우.
컴라면

컴 라 면

Database/PostgreSQL

[Optimizer] PostgreSQL Optimizer의 Stuctures

2018. 5. 25. 12:08

PostgreSQL의 planner가 동작하는 3단계

1. Preprocessing

  • Target list, limit clauses의 단순화 ex) '2+2'를 4로 변경

  • Normalizing boolean expression 'NOT (NOT a)' 를 'a' 로 변경

  • Flattening AND/OR

     

2. Cheapest access path 구하기

  • Access path와 cost를 저장하기 위해 RelOptInfo 구조체를 생성

    • RelOptInfo 구조체는 make_one_rel() 함수에서 생성

      • baserestrictinfo : WHERE clauses 저장
      • indexlist : related indexes 저장
    • RelOptInfo는 PlannerInfo 구조체의 simple_rel_array에 저장

  • All possible access path의 Cost를 estimation하고 RelOptInfo에 access path를 저장

    • Path를 생성하고, seq scan 비용을 계산하여 Path에 기록한 뒤 RelOptInfo의 pathlist에 저장
    • Related index가 있으면, index access path를 생성하고 index scan path를 저장
    • bitmap scan이 가능하면 bitmap scan path를 생성하고 bitmap scan path를 저장
  • RelOptInfo의 pathlist에서 cheapest access path를 획득

  • LIMIT, ORDER BY, ARREGISFDD cost를 estimation

 

3. Plan tree 생성

 


PlannerInfo

typedef struct PlannerInfo
{
	NodeTag		type;
	Query	   	*parse;			/* the Query being planned */
	PlannerGlobal 	*glob;			/* global info for current planner run */
	Index		query_level;		/* 1 at the outermost Query */
	struct PlannerInfo *parent_root;	/* NULL at outermost Query */

	/*
	 * plan_params contains the expressions that this query level needs to
	 * make available to a lower query level that is currently being planned.
	 * outer_params contains the paramIds of PARAM_EXEC Params that outer
	 * query levels will make available to this query level.
	 */
	List		*plan_params;	/* list of PlannerParamItems, see below */
	Bitmapset  	*outer_params;

	/*
	 * simple_rel_array holds pointers to "base rels" and "other rels" (see
	 * comments for RelOptInfo for more info).  It is indexed by rangetable
	 * index (so entry 0 is always wasted).  Entries can be NULL when an RTE
	 * does not correspond to a base relation, such as a join RTE or an
	 * unreferenced view RTE; or if the RelOptInfo hasn't been made yet.
	 */
	struct RelOptInfo **simple_rel_array;	/* All 1-rel RelOptInfos */
	int		simple_rel_array_size;	/* allocated size of array */

	/*
	 * simple_rte_array is the same length as simple_rel_array and holds
	 * pointers to the associated rangetable entries.  This lets us avoid
	 * rt_fetch(), which can be a bit slow once large inheritance sets have
	 * been expanded.
	 */
	RangeTblEntry **simple_rte_array;	/* rangetable as an array */

	/*
	 * all_baserels is a Relids set of all base relids (but not "other"
	 * relids) in the query; that is, the Relids identifier of the final join
	 * we need to form.  This is computed in make_one_rel, just before we
	 * start making Paths.
	 */
	Relids		all_baserels;

	/*
	 * nullable_baserels is a Relids set of base relids that are nullable by
	 * some outer join in the jointree; these are rels that are potentially
	 * nullable below the WHERE clause, SELECT targetlist, etc.  This is
	 * computed in deconstruct_jointree.
	 */
	Relids		nullable_baserels;

	/*
	 * join_rel_list is a list of all join-relation RelOptInfos we have
	 * considered in this planning run.  For small problems we just scan the
	 * list to do lookups, but when there are many join relations we build a
	 * hash table for faster lookups.  The hash table is present and valid
	 * when join_rel_hash is not NULL.  Note that we still maintain the list
	 * even when using the hash table for lookups; this simplifies life for
	 * GEQO.
	 */
	List		*join_rel_list;	/* list of join-relation RelOptInfos */
	struct HTAB 	*join_rel_hash; /* optional hashtable for join relations */

	/*
	 * When doing a dynamic-programming-style join search, join_rel_level[k]
	 * is a list of all join-relation RelOptInfos of level k, and
	 * join_cur_level is the current level.  New join-relation RelOptInfos are
	 * automatically added to the join_rel_level[join_cur_level] list.
	 * join_rel_level is NULL if not in use.
	 */
	List	**join_rel_level;	/* lists of join-relation RelOptInfos */
	int	join_cur_level; 	/* index of list being extended */
	List	*init_plans;		/* init SubPlans for query */
	List	*cte_plan_ids;		/* per-CTE-item list of subplan IDs */
	List	*multiexpr_params;	/* List of Lists of Params for MULTIEXPR subquery outputs */
	List	*eq_classes;		/* list of active EquivalenceClasses */
	List	*canon_pathkeys; 	/* list of "canonical" PathKeys */
	List	*left_join_clauses;	/* list of RestrictInfos for
					 * mergejoinable outer join clauses w/nonnullable var on left */
	List	*right_join_clauses;	/* list of RestrictInfos for
					 * mergejoinable outer join clauses w/nonnullable var on right */
	List	*full_join_clauses;	/* list of RestrictInfos for mergejoinable full join clauses */
	List	*join_info_list; 	/* list of SpecialJoinInfos */
	List	*append_rel_list;	/* list of AppendRelInfos */
	List	*rowMarks;		/* list of PlanRowMarks */
	List	*placeholder_list;	/* list of PlaceHolderInfos */
	List	*fkey_list;		/* list of ForeignKeyOptInfos */
	List	*query_pathkeys; 	/* desired pathkeys for query_planner() */
	List	*group_pathkeys; 	/* groupClause pathkeys, if any */
	List	*window_pathkeys;	/* pathkeys of bottom window, if any */
	List	*distinct_pathkeys;	/* distinctClause pathkeys, if any */
	List	*sort_pathkeys;		/* sortClause pathkeys, if any */
	List	*initial_rels;		/* RelOptInfos we are now trying to join */

	/* Use fetch_upper_rel() to get any particular upper rel */
	List	*upper_rels[UPPERREL_FINAL + 1]; /* upper-rel RelOptInfos */

	/* Result tlists chosen by grouping_planner for upper-stage processing */
	struct PathTarget *upper_targets[UPPERREL_FINAL + 1];

	/*
	 * grouping_planner passes back its final processed targetlist here, for
	 * use in relabeling the topmost tlist of the finished Plan.
	 */
	List    *processed_tlist;

	/* Fields filled during create_plan() for use in setrefs.c */
	AttrNumber *grouping_map;	/* for GroupingFunc fixup */
	List	   *minmax_aggs;	/* List of MinMaxAggInfos */
	MemoryContext planner_cxt;	/* context holding PlannerInfo */
	double	   total_table_pages;	/* # of pages in all tables of query */
	double	   tuple_fraction; 	/* tuple_fraction passed to query_planner */
	double	   limit_tuples;	/* limit_tuples passed to query_planner */
	bool	   hasInheritedTarget;	/* true if parse->resultRelation is an inheritance child rel */
	bool	   hasJoinRTEs;		/* true if any RTEs are RTE_JOIN kind */
	bool	   hasLateralRTEs; 	/* true if any RTEs are marked LATERAL */
	bool	   hasDeletedRTEs; 	/* true if any RTE was deleted from jointree */
	bool	   hasHavingQual;	/* true if havingQual was non-null */
	bool	   hasPseudoConstantQuals; /* true if any RestrictInfo has pseudoconstant = true */
	bool	   hasRecursion;	/* true if planning a recursive WITH item */

	/* These fields are used only when hasRecursion is true: */
	int	   wt_param_id;	        /* PARAM_EXEC ID for the work table */
	struct Path *non_recursive_path;/* a path for non-recursive term */

	/* These fields are workspace for createplan.c */
	Relids	   curOuterRels;	/* outer rels above current node */
	List	   *curOuterParams; 	/* not-yet-assigned NestLoopParams */

	/* optional private data for join_search_hook, e.g., GEQO */
	void	   *join_search_private;
} PlannerInfo;

 

Path

Path 타입은 추가 정보가 필요하지 않은 단순한 플랜 (ex> sequential scan path)에 이용되는 타입이다.

  • pathtype

    • NodeTag of the Plan node we could build from this Path
    • tag identifying scan/join method
  • param_info

    • "param_info", if not NULL, links to a ParamPathInfo that identifies outer relation(s) that provide parameter values to each scan of this path.
    • That means this path can only be joined to those rels by means of nestloop joins with this path on the inside. Also note that a parameterized path is responsible for testing all "movable" joinclauses involving this rel and the specified outer rel(s).
  • pathkeys

    • List of PathKey nodes
    • Path의 Output rows의 정렬 순서를 나타냄
typedef struct Path
{
  NodeTag   type;
  NodeTag   pathtype;   /* tag identifying scan/join method */
  RelOptInfo *parent;     /* the relation this path can build */
  ParamPathInfo *param_info;  /* parameterization info, or NULL if none */

/* estimated size/costs for path (see costsize.c for more info) */
  double    rows;     /* estimated number of result tuples */
  Cost    startup_cost; /* cost expended before fetching any tuples */
  Cost    total_cost;   /* total cost (assuming all tuples fetched) */

  List     *pathkeys;   /* sort ordering of path's output */
  /* pathkeys is a List of PathKey nodes; see above */
} Path;

 

 

저작자표시 비영리 변경금지 (새창열림)

'Database > PostgreSQL' 카테고리의 다른 글

PostgreSQL 9.3.4 설치 및 초기화  (0) 2023.01.09
[Optimizer] Correlation of PostgreSQL  (0) 2021.03.08
PostgreSQL에서 TPC-H 사용하기  (0) 2018.04.10
    'Database/PostgreSQL' 카테고리의 다른 글
    • PostgreSQL 9.3.4 설치 및 초기화
    • [Optimizer] Correlation of PostgreSQL
    • PostgreSQL에서 TPC-H 사용하기
    컴라면
    컴라면

    티스토리툴바